MEC282.001AUS PATENT 
METHOD AND APPARATUS FOR MULTI-USER MULTI-INPUT MULTI-OUTPUT TRANSMISSION 



Related Applications 

[0001] The present application claims priority to and the benefit of, under 35 
U.S.C. §119(e), U.S. Provisional Application No. 60/405,759, filed August 22, 2002, which 
is hereby incorporated by reference. 

Background of the Invention 

Field of the Invention 

[0002] The present invention relates to a method for multi-user MIMO 
transmission, and more particularly to a method for transmission between a base station and U (>1) 
user terminals, said base station and user terminals each equipped with more than one 
antenna, preferably in conjunction with the optimization of joint transmit and 
receive filters, for instance in an MMSE context. Further disclosed are base station and user 
terminal devices suited for execution of said method. 

Description of the Related Technology 

[0003] Multi-input multi-output (MIMO) wireless communications have attracted 
a lot of interest in recent years as they offer a multiplicity of spatial channels for the radio 
links, and hence provide a significant capacity or diversity increase compared to conventional 
single antenna communications. 

[0004] Multi-Input Multi-Output (MIMO) wireless channels have significantly 
higher capacities than conventional Single-Input Single-Output (SISO) channels. These 
capacities are related to the multiple parallel spatial subchannels that are opened through the 
use of multiple antennas at both the transmitter and the receiver. Spatial Multiplexing (SM) is 
a technique that transmits parallel independent data-streams on these available spatial 
subchannels in an attempt to approach the MIMO capacities. 

[0005] In addition, Spatial-Division Multiple Access (SDMA) is very appealing 
due to its inherent reuse of the precious frequency bandwidth (several users transmit 
simultaneously and are separated by exploiting their distinct spatial signatures). 

[0006] Several MIMO approaches can be followed which can be classified 
according to whether or not they require channel knowledge at either the transmitter or the 
receiver. Typically, the best performance can be obtained when the channel is known at both 
sides. 

[0007] The optimal solution is provided by SVD weights combined with a water- 
pouring strategy. However, this strategy must adaptively control the number of streams and 
also the modulation and coding in each stream, which makes it inconvenient for wireless 
channels. 
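
By way of illustration only, the water-pouring strategy mentioned above can be sketched as follows. The sketch assumes that the SVD has already reduced the channel to parallel subchannels with known, strictly positive power gains (the squared singular values) and unit noise power; the function name water_pouring and its parameters are hypothetical and do not appear in the application.

```python
def water_pouring(gains, total_power, noise=1.0):
    """Split total_power over parallel subchannels with power gains `gains`.

    Assumes every gain is strictly positive. Returns one power per
    subchannel; weak subchannels may receive zero power.
    """
    # Consider subchannels from strongest to weakest.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    powers = [0.0] * len(gains)
    for active in range(len(gains), 0, -1):
        used = order[:active]
        # Water level mu chosen so that sum(mu - noise/gain) == total_power.
        mu = (total_power + sum(noise / gains[i] for i in used)) / active
        # Accept this set only if the weakest used subchannel stays non-negative.
        if mu - noise / gains[used[-1]] >= 0:
            for i in used:
                powers[i] = mu - noise / gains[i]
            break
    return powers


# Example: the weakest subchannel is switched off entirely.
allocation = water_pouring([4.0, 1.0, 0.1], total_power=1.0)
```

In this example only two of the three subchannels carry power, which shows why the number of active streams, and with it the modulation and coding per stream, has to follow the channel.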

[0008] A sub-optimal approach consists of using a fixed number of data streams 
and identical modulation and coding as in a single-user joint transmit-receive (TX-RX) 
MMSE optimization [H. Sampath and A. Paulraj, "Joint TX & RX Optimization for High 
Data Rate Wireless Communication Using Multiple Antennas", Asilomar Conf. on Signals, 
Systems and Computers, pp. 215-219, Asilomar, California, November 1999]. This latter 
solution is more convenient but is not directly applicable to SDMA MIMO communications 
where a multi-antenna base station communicates at the same time with several multi- 
antenna terminals. Indeed, the joint TX-RX optimization requires channel knowledge at both 
sides, which is rather unfeasible at the terminal side (the terminal only knows its part of the 
multi-user wireless channel). 

[0009] To approach the potential MIMO capacity while optimizing the system 
performance, several joint TX/RX MMSE designs have been proposed. 

[0010] Two main design trends have emerged that enable Spatial Multiplexing 
corresponding to whether Channel State Information (CSI) is available at the transmitter. On 
the one hand, BLAST-like space-time techniques make use of the available transmit antennas 
to transmit as many independent streams and do not require CSI at the transmitter. On the 
other hand, the joint transmit and receive space-time processing takes advantage of the 
potentially available CSI at both sides of the link to maximize the system's information rate 
or alternatively optimize the system performance, under a fixed rate constraint. 



[0011] Within multi-user MIMO transmission schemes, multi-user interference 
results in a performance limitation. Further, the joint determination of optimal filters for both 
the base station and the user terminals in the case of a multi-user context is very complex. 

Summary of Certain Inventive Aspects 

[0012] Embodiments of the present invention provide a solution for the problem 
of multi-user interference in a multi-user MIMO transmission scheme, which results in a 
reasonably complex filter determination in the case of joint optimal filter determination, 
although the invention is not limited thereto. 

[0013] The invention includes a method of multi-user MIMO transmission of data 
signals from at least one transmitting terminal with a spatial diversity capability to at least 
two receiving user terminals, each provided with spatial diversity receiving capability, 
comprising: dividing said data signals into a plurality of streams of (sub-user) data sub- 
signals; determining combined data signals in said transmitting terminal, said combined data 
signals being transformed versions of said streams of data sub-signals, such that at least one 
of said spatial diversity devices of said receiving user terminals only receives data sub-signals 
being specific for the corresponding receiving user terminal; inverse subband processing said 
combined data signals; transmitting with said spatial diversity device said inverse subband 
processed combined data signals; receiving on at least one of said spatial diversity receiving 
device of at least one of said receiving terminals received data signals, being at least a 
function of said inverse subband processed combined data signals; determining on at least 
one of said receiving terminals estimates of said data sub-signals from said received data 
signals; and collecting said estimates of said data sub-signals into estimates of said data 
signals. 

[0014] In certain embodiments, the transmission of the inverse subband processed 
combined data signals is performed in a substantially simultaneous way. Typically, the 
spectra of the inverse subband processed combined data signals are at least partly 
overlapping. 

[0015] In some embodiments, the step of determining combined data signals in 
the transmitting terminal is carried out on a subband by subband basis. In other 
embodiments, the step of determining the estimates of said data sub-signals in the receiving 
terminals comprises subband processing. 

[0016] The determining combined data signals in the transmitting terminal may 
additionally comprise: determining intermediate combined data signals by subband 
processing of the data signals; and determining the combined data signals from the 
intermediate combined data signals. 

[0017] In other embodiments, the subband processing includes orthogonal 
frequency division demultiplexing and the inverse subband processing includes orthogonal 
frequency division multiplexing. 
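
As a minimal sketch of this pairing, inverse subband processing can be modeled as an inverse discrete Fourier transform and subband processing as the forward transform. The direct O(n^2) sums below are chosen for readability rather than speed, and the function names are hypothetical.

```python
import cmath


def inverse_subband_processing(symbols):
    """Naive IDFT: one symbol per subband in, time-domain samples out."""
    n = len(symbols)
    return [
        sum(symbols[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def subband_processing(samples):
    """Naive DFT: time-domain samples in, one symbol per subband out."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Over an ideal channel, the subband symbols are recovered unchanged.
sent = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
recovered = subband_processing(inverse_subband_processing(sent))
```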

[0018] In certain embodiments, the method includes subbands that are involved in 
inverse subband processing being grouped into sets, whereby at least one set includes at least 
two subbands and the step of determining combined data signals in the transmitting terminal 
comprises: determining relations between the data signals and the combined data signals on a 
set-by-set basis; and exploiting the relations between the data signals and the combined data 
signals for determining the data signals. 

[0019] A guard interval may be introduced in the inverse subband processed 
combined data signals. 
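
One common form of guard interval is a cyclic prefix, in which the tail of each inverse subband processed block is copied to its front. The application does not state which form is used, so the cyclic prefix below is an assumption, and the helper names are hypothetical.

```python
def add_guard_interval(block, guard_len):
    """Prepend the last guard_len samples of the block (cyclic prefix)."""
    if not 0 <= guard_len <= len(block):
        raise ValueError("guard_len must lie between 0 and the block length")
    return block[len(block) - guard_len:] + block


def remove_guard_interval(block, guard_len):
    """Drop the guard samples at the receiver before subband processing."""
    return block[guard_len:]


# Example with a four-sample block and a two-sample guard interval.
protected = add_guard_interval([1, 2, 3, 4], 2)   # [3, 4, 1, 2, 3, 4]
restored = remove_guard_interval(protected, 2)    # [1, 2, 3, 4]
```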

[0020] The determining of combined data signals may further comprise transmitter 
filtering, wherein the determining of estimates of the sub-signals comprises receiver filtering, 
said transmitter filtering and said receiver filtering being determined on a user-by-user basis. 

[0021] In certain embodiments, the number of streams of data sub-signals is 
variable. In other embodiments, the number of streams is selected in order to minimize the 
error between the estimates of the data sub-signals and the data sub-signals themselves. 
Alternatively, the number of streams may be selected in order to minimize the system bit 
error rate. 

[0022] Another aspect includes a method of transmitting data signals from at least 
two transmitting terminals each provided with spatial diversity transmitting device to at least 
one receiving terminal with a spatial diversity receiving device comprising: dividing said data 
signals into a plurality of streams of (sub-user) data sub-signals; transforming versions of said 
streams of said data sub-signals into transformed data signals; transmitting from said 
transmitting terminals said transformed data signals; receiving on said spatial diversity 
receiving device received data signals being at least a function of at least two of said 
transformed data signals; subband processing of at least two of said received data signals in 
said receiving terminal; applying a linear filtering on said subband processed received data 
signals, said linear filtering and said transforming being selected such that the filtered 
subband processed received data signals are specific for one of said transmitting terminals; 
determining estimates of said data sub-signals from said filtered subband processed received 
data signals in said receiving terminal; and collecting said estimates of said data sub-signals 
into estimates of said data signals. 

[0023] In certain embodiments, the transmission is substantially simultaneous and 
the spectra of the transformed data signals are at least partly overlapping. Additionally, the 
transformation of the data sub-signals to transformed data sub-signals may comprise inverse 
subband processing. 

[0024] In an alternative embodiment, the determining estimates of the data sub- 
signals from subband processed received data signals in the receiving terminal comprises: 
determining intermediate estimates of the data sub-signals from the subband processed 
received data signals in the receiving terminal; and obtaining the estimates of the data sub- 
signals by inverse subband processing the intermediate estimates. 

[0025] Another aspect includes an apparatus for transmitting inverse subband 
processed combined data signals to at least one receiving user terminal with spatial diversity 
device comprising at least: at least one spatial diversity transmitter; circuitry configured to 
divide data signals into streams of data sub-signals; circuitry configured to combine data 
signals, such that at least one of said spatial diversity device of said receiving user terminals 
only receives data sub-signals being specific for the corresponding receiving user terminal; 
circuitry being adapted for inverse subband processing combined data signals; and circuitry 
being adapted for transmitting inverse subband processed combined data signals with said 
spatial diversity device. 

[0026] Additionally, the circuitry configured to combine data signals may 
comprise a plurality of circuits, each configured to combine data signals based at least on part 
of the subbands of the data sub-signals. 

[0027] In other embodiments, the spatial diversity transmitter comprises at least 
two transmitters and the circuitry configured to transmit inverse subband processed combined 
data signals comprises a plurality of circuits configured to transmit the inverse subband 
processed combined data signals with one of the transmitters of the spatial diversity device. 

[0028] Yet another aspect includes an apparatus for transmitting data signals to at 
least one receiving terminal with the spatial diversity device, comprising at least: at least one 
spatial diversity transmitter; circuitry configured to divide data signals into streams of data 
sub-signals; circuitry configured to transform versions of the data sub-signals; and circuitry 
configured to transmit with the spatial diversity device the transformed versions of the data 
sub-signals, such that at least one of the spatial diversity devices of the receiving terminal 
only receives specific received data sub-signals. 

[0029] A further aspect includes a method to calibrate a transceiver for wireless 
communication comprising at least one transmitter/receiver pair connected to an antenna 
branch, such that front-end mismatches in the transmitter/receiver pair can be compensated, 
comprising: providing a splitter, a directional coupler, a transmit/receive/calibration switch, a 
calibration noise source and a power splitter; matching the power splitter outputs between all 
branches of the transceiver, matching the directional couplers and matching the 
transmit/receive/calibration switches between all antenna branches of the transceiver; 
switching on the calibration connection of the transmit/receive/calibration switch; in each of 
the antenna branches, generating a known signal and calculating an averaged frequency 
response of the cascade of the transmitter and the receiver of the transmitter/receiver pair; 
connecting the transmit/receive/calibration switch so as to isolate the receiver from both the 
transmitter and the antenna; switching on the calibration noise source; calculating an 
averaged frequency response of all receiver branches of the transceiver; determining the 
values to be pre-compensated from the calculated averaged frequency responses of the 
cascade of the transmitter and the receiver of the transmitter/receiver pair and of all receiver 
branches of said transceiver; and pre-compensating the transmitter/receiver pair using the 
inverse of the values. 
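
The final two steps can be illustrated under a simplified model that is an assumption made here, not a statement of the application: at one frequency, branch k has a complex transmitter response T_k and receiver response R_k, the first measurement yields the cascade T_k * R_k, and the noise-source measurement yields R_k up to a factor common to all branches (taken as 1). The value to be pre-compensated is then taken to be the ratio T_k / R_k. All names are hypothetical.

```python
def precompensation_weights(cascade, rx):
    """Return, per branch, the inverse of the assumed mismatch T_k / R_k.

    cascade[k] is the measured T_k * R_k and rx[k] the measured R_k
    (simplified single-frequency model, see the assumptions above).
    """
    weights = []
    for c, r in zip(cascade, rx):
        mismatch = c / (r * r)          # equals T_k / R_k under the model
        weights.append(1 / mismatch)    # pre-compensate with the inverse
    return weights


# Example with two branches whose front-ends differ.
tx = [1.0 + 0.2j, 0.8 - 0.1j]
rx = [0.9 + 0.1j, 1.1 + 0.3j]
cascade = [t * r for t, r in zip(tx, rx)]
weights = precompensation_weights(cascade, rx)
# After pre-compensation the TX/RX ratio is the same (here 1) on every branch.
ratios = [t * w / r for t, w, r in zip(tx, weights, rx)]
```

Under this model, equal ratios on all branches mean that the front-end mismatch no longer distinguishes the transmit path from the receive path.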

[0030] In certain embodiments, the transceiver is a base station transceiver. In 
addition, the pre-compensating may be performed digitally. 

Brief Description of the Drawings 

[0031] Fig. 1 illustrates a downlink communication set-up. 
[0032] Fig. 2 illustrates an uplink communication set-up. 

[0033] Fig. 3 illustrates the multi-user (U>1), MIMO (A>1, B_u>1), multi-stream 
(C_u>1) context of the systems and methods. 

[0034] Fig. 4 illustrates the involved matrices. 

[0035] Figs. 5 and 6 illustrate simulation results for the block diagonalization 
approach used in the context of joint transmit and receive filter optimization. 

[0036] Fig. 7 illustrates the matrix dimension for 8 antennas at the base station. 

[0037] Fig. 8 illustrates a Spatial Multiplexing MIMO System. 

[0038] Fig. 9 illustrates the existence (a) and distribution (b) of the optimal 
number of streams p_opt for a (6,6) MIMO system. 

[0039] Fig. 10 illustrates the distribution of p_opt for different reference rates R. 

[0040] Fig. 11 illustrates the MSEp versus p for different SNR levels. 

[0041] Fig. 12 illustrates a comparison between the exact MSEp and the 
simplified one. 

[0042] Fig. 13 illustrates a comparison between the BER performance of the 
spatially optimized and conventional Tx/Rx MMSE. 

[0043] Fig. 14 illustrates a comparison of the spatially optimized joint Tx/Rx 
MMSE to the optimal joint Tx/Rx MMSE and spatial adaptive loading for different reference 
rates R. 

[0044] Fig. 15 illustrates a block diagram of a multi-antenna base station with 
calibration loop. 

[0045] Fig. 16 illustrates the BER degradation with and without calibration. 
[0046] Fig. 17 illustrates a second calibration method. 

Detailed Description of Certain Embodiments 
[0047] The following detailed description of certain embodiments presents various 
descriptions of specific embodiments of the present invention. However, the present invention 
can be embodied in a multitude of different ways. In this description, reference is made to the 
drawings wherein like parts are designated with like numerals throughout. 

[0048] Embodiments of the systems and methods involve (wireless) 
communication between terminals. One can logically group the terminals on each side of the 
communication and refer to them as peers. The peer(s) on one side of the communication can 
embody at least two user terminals, whereas on the other side the peer(s) can embody at least 
one base station. Thus, the systems and methods can involve multi-user communications. 
The peers can transmit and/or receive information. For example, the peers can communicate 
in a half-duplex fashion, which refers to either transmitting or receiving at one instance of 
time, or in a full duplex fashion, which refers to substantially simultaneously transmitting and 
receiving. 

[0049] Certain embodiments include MIMO wireless communication channels, as 
they have significantly higher capacities than conventional SISO channels. Several MIMO 
approaches may be used, depending on whether channel knowledge is available at either the 
transmit or receive side, as discussed below. 

[0050] Fig. 1 illustrates a downlink communication set-up, and Fig. 2 illustrates 
an uplink communication set-up. Space Division Multiple Access (SDMA) techniques are 
introduced for systems making use of subband processing, and are thus consistent with a 
multi-carrier approach. As shown in Figs. 1 and 2, communication peers 30, 230 include 
terminal(s) 40, 240 equipped with transmitting and/or receiving devices 80, 220 that are able to 
provide different spatial samples of the transmitted and/or received signals. These 
transmitting and/or receiving devices are called spatial diversity devices. A peer at the base 
station side is called a processing peer 30, 230. A processing peer communicates with at least 
two terminals at an opposite peer 10, 340, which can operate at least partially simultaneously, 
and the spectra of the communicated signals can at least partially overlap. Note that Frequency 
Division Multiple Access techniques rely on signal spectra being non-overlapping, while 
Time Division Multiple Access techniques rely on communicating signals in different time 
slots, thus not simultaneously. The opposite peer(s) include at least two user terminals 20, 
330 (labeled in Figure 1 as User Terminal 1 and User Terminal 2) using the same frequencies 
at the same time, and are referred to as the composite peer(s) 10, 340. The systems and 
methods involve (wireless) communication between terminals whereby at least the 
processing peer(s) 30, 230 have subband processing capabilities. 

[0051] The communication between a composite peer and a processing peer can 
include downlink (Figure 1) and uplink (Figure 2) transmissions. Uplink transmission refers 
to a transmission whereby the composite peer transmits data signals and the processing peer 
receives data signals. Downlink transmission refers to a transmission whereby the processing 
peer transmits data signals and the composite peer receives data signals. The uplink and 
downlink transmissions can be, for example, simultaneous (full duplex) with respect to the 
channel (for example, using different frequency bands), or they can operate in a time-duplex 
fashion (half duplex) (for example, using the same frequency band), or any other 
configuration. 

[0052] A (linear) pre-filter can be used at the transmit side, to achieve a block 
diagonalization of the channel. At the receiver side, (linear) post-filtering can be applied. 

[0053] (Wireless) transmission of data or a digital signal from a transmitting to a 
receiving circuit includes digital to analog conversion in the transmission circuit and analog 
to digital conversion in the receiving circuit. In addition, the apparatus in the communication 
set-up can have transmission and receiving devices, also referred to as front-end, 
incorporating these analog-to-digital and digital-to-analog conversions, including 
amplification or signal level gain control and realizing the conversion of the RF signal to the 
required baseband signal and vice versa. A front-end can comprise amplifiers, filters and 
mixers (down converters). As such, in the text all signals are represented as a sequence of 
samples (digital representation), thereby assuming that the above-mentioned conversion also 
takes place. This assumption does not limit the scope of the invention though. 
Communication of a data or a digital signal is thus symbolized as the transmission and 
reception of a sequence of (discrete) samples. Prior to transmission, the information 
contained in the data signals can be fed to one or more carriers or pulse-trains by mapping 
said data signals to symbols which consequently modulate the phase and/or amplitude of the 
carrier(s) or pulse-trains (e.g., using quadrature amplitude modulation (QAM) or quadrature 
phase shift keying (QPSK) modulation). The symbols belong to a finite set, which is called 
the transmitting alphabet. The signals resulting after performing modulation and/or front-end 
operations on the data signals are called transformed data signals, to be transmitted further. 
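
The notion of a transmitting alphabet can be illustrated with QPSK, one of the modulations named above. The Gray-coded bit labeling and the unit-energy scaling used here are assumptions made for the example, and all names are hypothetical.

```python
SCALE = 2 ** -0.5  # normalizes every symbol to unit energy

# Assumed Gray labeling: neighboring symbols differ in one bit.
QPSK_ALPHABET = {
    (0, 0): complex(1, 1) * SCALE,
    (0, 1): complex(-1, 1) * SCALE,
    (1, 1): complex(-1, -1) * SCALE,
    (1, 0): complex(1, -1) * SCALE,
}


def map_bits_to_symbols(bits):
    """Map pairs of bits to symbols of the finite transmitting alphabet."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [QPSK_ALPHABET[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def nearest_symbol_bits(sample):
    """Decide on the alphabet member closest to a received sample."""
    best = min(QPSK_ALPHABET, key=lambda label: abs(sample - QPSK_ALPHABET[label]))
    return list(best)


# Example: three symbols, slightly perturbed, are still decided correctly.
bits = [0, 0, 1, 1, 0, 1]
symbols = map_bits_to_symbols(bits)
decided = []
for s in symbols:
    decided += nearest_symbol_bits(s + complex(0.1, -0.05))
```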

[0054] After reception by the receiving device, the information contained in the 
received signals is retrieved by transformation and estimation processes. In some 
embodiments, these transformation and estimation processes can include demodulation, 
subband processing, decoding and equalization. In other embodiments, these transformation 
and estimation processes do not include demodulation, subband processing, decoding and 
equalization. After said estimation and transformation processes, received data signals are 
obtained, including symbols belonging to a finite set, which is called the receiving alphabet. 
The receiving alphabet is preferably equal to the transmitting alphabet. 

[0055] Embodiments of the invention further include methods and systems for 
measuring the channel impulse responses between the transmission and/or reception devices 
of the individual user terminals at the composite peer on the one hand, and the spatial 
diversity device of the processing peer on the other hand. The channel impulse responses 
measurement can be either obtained on basis of an uplink transmission and/or on basis of a 
downlink transmission. Thus the measured channel impulse responses can be used by the 
processing peer and/or composite peer in uplink transmissions and/or in downlink 
transmissions. This, however, assumes perfect reciprocity between transceiver circuits, which 
usually is not the case in practice because of, e.g., the different filters being used in transmit 
and receive path. Additional methods are discussed below to address the non-reciprocity 
issue. Additional embodiments further include methods for determining the received data 
signal power and methods for determining the interference ratio of data signals. 

[0056] The spatial diversity device ensures the reception or transmission of 
distinct spatial samples of the same signal. This set of distinct spatial samples of the same 
signal is called a spatial diversity sample. In certain embodiments, spatial diversity devices 
embody separate antennas. In these embodiments, the multiple antennas belonging to one 
terminal can be placed spatially apart (as shown in Figures 1 and 2), or they can use a 
different polarization. The multiple antennas belonging to one terminal are sometimes 
collectively called an antenna array. The systems and methods are maximally efficient if the 
distinct samples of the spatial diversity sample are sufficiently uncorrelated. In some 
embodiments, the sufficiently uncorrelated samples may be achieved by placing different 
antennas apart over a sufficiently large distance. For example, the distance between different 
antennas can be chosen to be half a wavelength of the carrier frequency at which the 
communication takes place. Spatial diversity samples are thus different from each other due 
to the different spatial trajectory from the transmitter to their respective receiver or vice 
versa. Alternatively, said spatial diversity samples may be different from each other due to 
the different polarization of their respective receivers or transmitters. 
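
The half-wavelength spacing mentioned above is a short calculation. The 5 GHz carrier in the example is an assumed value chosen for illustration; the application does not specify a carrier frequency here.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def half_wavelength_spacing(carrier_hz):
    """Antenna spacing, in metres, equal to half the carrier wavelength."""
    return SPEED_OF_LIGHT / carrier_hz / 2.0


# Example: roughly 3 cm at an assumed 5 GHz carrier.
spacing_m = half_wavelength_spacing(5e9)
```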

[0057] Certain embodiments of the systems and methods rely on the fact that at 
least the processing peer performs an inverse subband processing, called ISP in the sequel, in 
the downlink mode (Figure 1) and subband processing, called SP in the sequel, in the uplink 
mode (Figure 2). Furthermore, in the downlink mode, SP takes place either in the composite 
peer after reception (see Figure 1, bottom) or in the processing peer before ISP (see Figure 1, 
top). In the uplink mode ISP takes place either in the composite peer prior to transmission 
(see Figure 2, bottom) or in the processing peer after SP (see Figure 2, top). Concentrated 
scenarios refer to the situation where both ISP and SP are in either transmission direction 
carried out in the processing peer. The remaining scenarios, e.g., where ISP and SP are 
carried out in different peers in either transmission direction, are referred to as split scenarios. 

[0058] In addition, the communication methods can transmit data signals from 
one peer to another peer, but due to transmission conditions, in fact only estimates of the data 
signals can be obtained in the receiving peer. The transmission methods typically are such 
that the data signal estimates approximate the data signals as closely as technically possible. 

[0059] The systems and methods can include downlink transmission methods for 
communication between a base station and U (>1) user terminals. In some embodiments, a 
double level of spatial multiplexing is used. This refers to the users being spatially 
multiplexed (SDMA) and each user receiving spatially multiplexed bit streams (SDM). The 
methods may further include the steps of (linear) pre-filtering in the base station and possibly 
(linear) post-filtering in at least one of the user terminals. Substantially simultaneously, C = 
Σ_u C_u independent information signals are sent from the base transceiver station to the U 
remote transceivers, whereby, for each remote transceiver, the C_u information signals share 
the same conventional channel. The base transceiver station has an array of N (>1) base 
station antennas (defining a spatial diversity device). Each of the remote transceivers has an 
array of M_u (>1) remote transceiver antennas (also defining a spatial diversity device), M_u 
being terminal specific as each terminal may have a different number of antennas. One 
advantage of such a method is that each of the U remote transceivers is capable of 
determining a close estimate of the C_u independent information signals, said estimate being 
constructed from an M_u-component signal vector received at the related remote transceiver's 
antenna array. The method comprises the step of dividing each of the C independent signals 
into a plurality of streams of C_u sub-signals and computing an N-component transmission 
vector U as a weighted sum of C N-component vectors V_i, wherein the sub-signals are used 
as weighting coefficients. 
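
The weighted sum described above can be sketched directly: each sub-signal scales its own N-component vector, and the scaled vectors are added antenna by antenna. The function and argument names are hypothetical, and the numbers in the example are arbitrary.

```python
def transmission_vector(sub_signals, weight_vectors):
    """Return sum_i s_i * V_i, one entry per base station antenna."""
    if len(sub_signals) != len(weight_vectors):
        raise ValueError("one weight vector is needed per sub-signal")
    n_antennas = len(weight_vectors[0])
    result = [0j] * n_antennas
    for s, v in zip(sub_signals, weight_vectors):
        for antenna in range(n_antennas):
            result[antenna] += s * v[antenna]
    return result


# Example: C = 3 sub-signals transmitted over N = 2 antennas.
x = transmission_vector([1, -1, 1j], [[1, 0], [0, 1], [1, 1]])
# x == [1 + 1j, -1 + 1j]
```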

[0060] Alternatively formulated, this aspect discloses a method for transmitting 
user specific data signals from at least one transmitting terminal 240 with a spatial diversity 
capability 220 to at least two receiving terminals 330 with a spatial diversity capability 320. 
The method comprises: dividing 205 the user data signals 200 into a plurality of streams of 
sub-user data sub-signals 210; determining 250 combined data signals 300 in the transmitted 
signals, whereby the combined data signals are transformed versions of the streams of sub- 
user data sub-signals 210, such that at least one of said spatial diversity device 320 of said 
receiving user terminals only receives data sub-signals being specific for the corresponding 
receiving user terminal (in other embodiments, 'at least one' can be understood to mean 
'substantially all'); inverse subband processing 260 the combined data signals 300; 
transmitting with the transmitting terminal spatial diversity device 220 the inverse subband 
processed combined data signals; receiving on at least one of the spatial diversity receivers 
320 of at least one of the receiving terminals 330 received data signals; determining on at 
least one of said receiving terminals 330 estimates of the specific user data sub-signals from 
the received data signals; and collecting said estimates of the data sub-signals into estimates 
of the data signals. 

[0061] A transmit pre-filter in the base station can be used to achieve a block 
diagonalization of the channel, resulting in a substantially zero multi-user interference. In this 
embodiment, each terminal then only has to eliminate its own inter-stream interference, 
which does not require information from the other users' channels. The vectors Vi are 
selected such that the M_u-component signal vector received by the antennas of a particular 
remote transceiver substantially only contains signal contributions directly related to the C_u 
sub-signals of the original information signal that the remote transceiver should reconstruct. 
Determining combined data signals is essentially based on the distinct spatial signatures of 
the transmitted combined data signals (SDMA) and is such that the spatial diversity 
capability of a terminal receives the sub-data signals specific for the user of that terminal. This 
approach can be exploited in a multi-user SDMA MIMO TX-RX optimization context, which 
results in a decoupling of this overall optimization into several single-user optimizations, 
where each optimization depends on a single-user MIMO channel. The close estimate of one 
of the C_u independent information signals is constructed from the M_u-component signal vector 
received at the related remote transceiver's antenna array by using the steps comprising: 
selecting M_u-component vectors P_i and computing an M_u-component receive vector as a 
weighted sum of the M_u-component vectors P_i, wherein the components of the M_u-component 
signal vector are used as weighting coefficients. Thereafter, the components of the obtained 
weighted sum are combined in order to obtain the desired estimate. The vectors V_i, P_i can be 
determined in a joint MMSE optimization scheme, independently for each remote terminal. 
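
One way to obtain vectors with the zero multi-user interference property is to restrict each user's transmit vectors to the null space of the other users' channels. The sketch below does this with Gram-Schmidt orthogonalization on small real-valued matrices; practical channels are complex-valued and the application does not prescribe this construction, so it is an illustration under those assumptions. All names and channel values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, against=(), tol=1e-9):
    """Gram-Schmidt: orthonormal vectors spanning `vectors`, kept orthogonal
    to the already orthonormal vectors in `against`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in list(against) + basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis


def null_space_precoder(other_user_channels, n_antennas):
    """Transmit vectors that the other users' antennas cannot observe."""
    rows = [row for h in other_user_channels for row in h]
    row_space = orthonormalize(rows)
    identity = [[1.0 if i == j else 0.0 for j in range(n_antennas)]
                for i in range(n_antennas)]
    return orthonormalize(identity, against=row_space)


# Example: N = 4 base station antennas, two users with 2 antennas each.
H1 = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
H2 = [[2.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 2.0]]
V1 = null_space_precoder([H2], 4)  # vectors carrying user 1's sub-signals
leak = max(abs(dot(row, v)) for row in H2 for v in V1)  # numerically zero
```

Because user 2 observes nothing of user 1's vectors, user 1's filters can then be optimized from H1 and V1 alone, which is the decoupling into single-user optimizations described above.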

[0062] The transmission can be done substantially simultaneously. The spectra of 
the (transmitted) inverse subband processed combined data signals can be at least partly 
overlapping. 

[0063] In the downlink split scenario, the determination of the data sub-signal 
estimates in the receiving terminals comprises subband processing 350 as shown in Fig. 1. In 
the downlink concentrated scenario determining 250 combined data signals in the 
transmitting terminal comprises: determining intermediate combined data signals 290 by 
subband processing 280 the data sub-signals 210, and determining 270 the combined data 
signals from the intermediate combined data signals. 

[0064] Also included are uplink transmission methods for communication 
between U (>1) user terminals and a base station. As in the downlink case, a double level of 
spatial multiplexing may be used (SDMA-SDM). The methods further include the steps of 
pre-filtering in at least one of the transmitting user terminals and post-filtering in the base 
station. Substantially simultaneously, C = Σ_u C_u independent information signals are sent from 
the U user terminals to the base transceiver station, whereby, for each user terminal, the 
C_u information signals share the same conventional channel. Each of the user terminals has 
an array of M_u (>1) transmit antennas (also defining a spatial diversity device), M_u being 
terminal specific as each terminal may have a different number of antennas. The base 
transceiver station has an array of N (>1) base station antennas (defining a spatial diversity 
device). In this way, the base station may be capable of determining a close estimate of the C_u 
independent information signals, said estimate being constructed from an N-component signal 
vector received at the base station antenna array. The method further comprises the step of 
dividing each of the C independent signals into a plurality of streams of C_u sub-signals and 
computing an M_u-component transmission vector as a weighted sum of C M_u-component 
vectors, wherein the sub-signals are used as weighting coefficients. 

[0065] Certain embodiments include a method of transmitting data signals 50 
from at least two transmitting terminals 20, each provided with spatial diversity transmitter 
60 to at least one receiving terminal 40 with a spatial diversity receiver 80, comprising: 
dividing 105 said data signals 50 into a plurality of streams of (sub-user) data sub-signals 
108; transforming versions of said streams of said data sub-signals 108 into transformed data 
signals 70; transmitting from said transmitting terminals 20 said transformed data signals 
70; receiving on said spatial diversity receiving device 80 received data signals being at least 
a function of at least two of said transformed data signals 70; subband processing 90 of at least 
two of said received data signals in said receiving terminal 40; applying a linear filtering 95 
on said subband processed received data signals, said linear filtering and said transforming 
being selected such that the filtered subband processed received data signals are specific for 
one of said transmitting terminals; determining 150 estimates of said data sub-signals 120 
from said filtered subband processed received data signals 140 in said receiving terminal; and 
collecting said estimates of said data sub-signals into estimates of said data signals. This can 
comprise a joint detection operation, for example, a State Insertion Cancellation. 

[0066] In certain embodiments, the transformed data signals can be transmitted 
substantially simultaneously. The spectra of the transformed data signals can be at least partly 
overlapping. 

[0067] In the uplink split scenario the transformation of the data sub-signals 108 
(see Fig. 2) to transformed data signals 70 comprises inverse subband processing 160. In the 
uplink concentrated scenario, the determination 150 of data sub-signal estimates from the 
obtained subband processed received data signals in the receiving terminal comprises the 
steps of: determining 100 intermediate estimates 130 of the data sub-signals from the 
subband processed received data signals in the receiving terminal; obtaining the estimates of 
the data sub-signals 120 by inverse subband processing 110 the intermediate estimates. 

[0068] It is a characteristic of some embodiments that said transmission methods 
are not a straightforward concatenation of a Space Division Multiple Access technique and a 
multi-carrier modulation method. The methods for multi-user MIMO transmission include 
the use of a double level of spatial multiplexing. For example, the users may be spatially 
multiplexed (SDMA) and each user can receive spatially multiplexed bit streams (SDM). 
Further, said method includes the steps of pre-filtering in the transmitting station and post- 
filtering in at least one of said receive terminals. 

[0069] Some embodiments implement a multicarrier modulation technique. An 
example of such a multicarrier modulation technique uses Inverse Fast Fourier Transform 
algorithms (IFFT) as ISP and Fast Fourier Transform algorithms (FFT) as SP, and the 
modulation technique is called Orthogonal Frequency Division Multiplexing (OFDM) modulation. It 
can be stated that in the uplink transmission method, the subband processing is orthogonal 
frequency division demultiplexing. It can also be stated that in the uplink transmission 
method, the inverse subband processing is an orthogonal frequency division multiplexing. It 
can also be stated that in the downlink transmission method, the subband processing is 
orthogonal frequency division demultiplexing. It can also be stated that in the downlink 
transmission method, the inverse subband processing is orthogonal frequency division 
multiplexing. 
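
The pairing of ISP and SP described above can be illustrated with a minimal sketch. It assumes a naive discrete Fourier transform as a stand-in for the IFFT/FFT algorithms, a single 4-subcarrier block, and QPSK symbols; the function name `dft` and the symbol values are hypothetical and chosen only for illustration.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, standing in for FFT/IFFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

# One block of QPSK symbols, one per subcarrier (subband domain representation).
subband_symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]

# ISP at the transmitting side (OFDM multiplexing): subband domain -> time domain.
time_samples = dft(subband_symbols, inverse=True)

# SP at the receiving side (OFDM demultiplexing): time domain -> subband domain.
recovered = dft(time_samples)

max_error = max(abs(a - b) for a, b in zip(recovered, subband_symbols))
print(max_error < 1e-9)
```

Over an ideal channel the SP output equals the symbols fed to the ISP, which is why the processing between SP and ISP can be treated per subcarrier.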

[0070] In concentrated scenarios, the processing that is carried out in the 
processing peer on samples between SP 90 280 (see Figs. 1 and 2) and ISP 110 260 is called 
subband domain processing 270 100. In split scenarios, the processing that is carried out prior 
to ISP 160 260 in the transmitting terminals and after SP 90 350 in the receiving terminals, is 
called subband domain processing (e.g., item numeral 250 in Fig. 1). "Prior to ISP" refers to 
occurring earlier in time during the transmission or the reception, and the term "after SP" 
refers to occurring later in time during the transmission or the reception. In concentrated 
scenarios, the signals 130 140 290 300 (as shown in Figs. 1 and 2) between the SP and the 
ISP are called signals in subband domain representation. In split scenarios, the signals 50 300 
200 before the ISP in the transmitting terminals and the signals 360 140 120 after the SP in 
the receiving terminals are called signals in a subband domain representation. 

[0071] In certain embodiments, the subband processing consists of Fast Fourier 
Transform (FFT) processing and the inverse subband processing consists of Inverse Fast 
Fourier Transform (IFFT) processing. FFT processing refers to taking the Fast Fourier 
Transform of a signal. Inverse FFT processing refers to taking the Inverse Fast Fourier 
Transform of a signal. 

[0072] The transmitted sequence can be divided in data subsequences prior to 
transmission. The data subsequences correspond to subsequences that are processed as one 
block by the subband processing device. In case of multipath conditions, a guard interval 
containing a cyclic prefix or postfix is inserted between each pair of data subsequences in the 
transmitting terminal(s). If multipath propagation conditions are experienced in the wireless 
communication resulting in the reception of non-negligible echoes of the transmitted signal 
and the subband processing capability consists of (an) FFT and/or IFFT operation(s), this 
guard introduction results in the substantial equivalence between convolution of the time- 
domain data signals with the time-domain channel response on the one hand and 
multiplication of the frequency-domain data-signals with the frequency-domain channel 
response on the other hand. The insertion of the guard intervals can occur in both 
concentrated and split scenarios. Thus in certain embodiments of a split scenario, the 
transmitting terminal(s) insert guard intervals containing a cyclic prefix or postfix between 
each pair of data subsequences after performing ISP on the data subsequences and before 
transmitting the data subsequences. In another embodiment of a concentrated scenario, the 
guard intervals are inserted in the transmitted sequence between each pair of data 
subsequences without performing ISP on the data subblocks in the transmitting terminal(s). 
This can be formalized by stating that in the uplink transmission methods the 
transformation of the data signals to transmitted data signals further comprises guard interval 
introduction. The guard interval introduction can also be applied in the downlink transmission 
methods. Alternatively, overlap-and-save techniques can be utilized. 
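
The equivalence that the guard interval creates between time-domain convolution and per-subcarrier multiplication can be checked numerically. The sketch below is illustrative only: the 3-tap channel, the 8-sample block, and the helper names `dft` and `linear_convolve` are assumptions, and a naive DFT stands in for the FFT.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def linear_convolve(x, h):
    """Time-domain effect of a multipath channel with impulse response h."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

# One time-domain data subsequence and a hypothetical 3-tap channel response.
block = [1 + 0j, -1 + 0j, 1j, -1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
channel = [1.0 + 0j, 0.5 + 0j, 0.25j]
prefix_len = len(channel) - 1   # guard interval just long enough to absorb the echoes

# Transmitter: prepend the cyclic prefix (a copy of the last samples of the block).
tx = block[-prefix_len:] + block

# Channel, then receiver: discard the guard interval and keep one block.
rx = linear_convolve(tx, channel)
rx_block = rx[prefix_len:prefix_len + len(block)]

# Frequency domain: the received block equals data times channel, per subcarrier.
padded_channel = channel + [0j] * (len(block) - len(channel))
lhs = dft(rx_block)
rhs = [x * h for x, h in zip(dft(block), dft(padded_channel))]
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_error < 1e-9)
```

Because of the prefix, each subcarrier sees a single complex gain, which is the flat-fading condition that the per-subcarrier processing relies on.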

[0073] The terminal(s) equipped with the spatial diversity device has (have) SP 
and/or ISP capability that enables subband processing of the distinct samples of the spatial 
diversity sample. It also has the capability for combinatory processing. Combinatory 
processing refers to processing data coming from subbands of the distinct samples in the spatial 
diversity sample. In the combinatory processing, different techniques can be applied to 
retrieve or estimate the data coming from the different distinct terminals or to combine the 
data to be transmitted to distinct terminals. Embodiments include methods for performing the 
combinatory processing, both for uplink transmission and for downlink transmission. 

[0074] Combinatory processing in the downlink includes a communication 
situation whereby the peer having spatial diversity capability, which is referred to as the 
processing peer, transmits signals to the composite peer, which embodies different terminals 
transmitting (at least partially simultaneous) so-called inverse subband processed combined 
data signals (having at least partially overlapping spectra). Determining 250 (see Fig. 1) 
combined data signals 300 in the transmitting terminal in the downlink transmission method 
refers to the combinatory processing. 

[0075] Combinatory processing in the uplink includes a communication situation 
whereby the peer having spatial diversity capability, which is referred to as the 
processing peer, receives signals from the composite peer, which embodies different 
terminals transmitting (at least partially simultaneous) transformed data signals (having at 
least partially overlapping spectra). The determination of estimates of the data sub-signals 
120 (see Fig. 2) from the subband processed received data signals 140 in said receiving 
terminal in the uplink transmission method refers to the combinatory processing. 

[0076] The downlink transmission methods are now discussed in more detail. 
Consider therefore a base station (BS) with A antennas and U simultaneous user terminals 
(UT) each having B_u (M_u) antennas. The BS simultaneously transmits several symbol 
streams towards the U UTs: C_1 streams towards UT1, C_2 streams towards UT2, and so on. 
Each user terminal UT_u receives a mixture of the symbol streams and attempts to recover its 
own stream of C_u symbols. To this end, each UT can be fitted with a number of antennas B_u 
greater than or equal to C_u (B_u ≥ C_u). This transmission scheme can be referred to as SDM- 
SDMA: SDMA achieves the user separation and SDM achieves the per-user stream 
separation. The model is typically used for flat fading channels, but it also applies to 
frequency selective channels with multi-carrier transmission (e.g., OFDM), where flat fading 
conditions prevail on each sub-carrier. 

[0077] Figure 3 illustrates the set-up (the downlink transmission is illustrated 
from right to left). In the embodiment of Fig. 3, at each time instant k, the BS transmits the 
signal vector s(k) obtained by pre-filtering the symbol vector x(k), which itself results from 
stacking the U symbol vectors x^u(k) as follows (vectors are represented as boldface lowercase 
and matrices as boldface uppercase; the superscript T denotes transpose): 

$$\mathbf{s}(k) = [s_1(k) \;\ldots\; s_A(k)]^T = \mathbf{F}\,\mathbf{x}(k)$$
$$\mathbf{x}(k) = [\mathbf{x}^1(k)^T \;\ldots\; \mathbf{x}^U(k)^T]^T$$ (formula 1)

Assuming flat fading, the signal received by the u-th terminal can be written as follows: 

$$\mathbf{r}^u(k) = \mathbf{H}^u\,\mathbf{s}(k) + \mathbf{n}^u(k)$$ (formula 2)

where H^u consists of the B_u rows of the full channel matrix H that correspond to user u. In 
other words, H^u is the MIMO sub-channel from the BS to user u. The full channel matrix H has 
dimension $\left(\sum_{u=1}^{U} B_u\right) \times A$. 

Each user applies a linear post-filter G^u to recover an estimate of the transmitted symbol 
vector x^u(k): 

$$\hat{\mathbf{x}}^u(k) = \mathbf{G}^u\,\mathbf{r}^u(k) = \mathbf{G}^u\,\mathbf{H}^u\,\mathbf{F}\,\mathbf{x}(k) + \mathbf{G}^u\,\mathbf{n}^u(k)$$ (formula 3)

Note that x(k) contains the symbols of all U users, hence MUI can cause severe signal-to- 
noise ratio degradation if not properly dealt with. 

In order to zero out the MUI, in some embodiments the F matrix is designed such that it 
block diagonalizes the channel, e.g., the product H·F is block diagonal with the u-th block in 
the diagonal being of dimension B_u×B_u. This ensures that, under ideal conditions, the MUI is 
substantially eliminated, leaving primarily per-user multi-stream interference, which will be 
tackled by a per-user processing. 
First, it is noted that H is the vertical concatenation of the U "BS-to-user-u" matrices H^u and 
F is the horizontal concatenation of the U pre-filtering matrices F^u: 

$$\mathbf{H} = [\mathbf{H}^{1T} \;\ldots\; \mathbf{H}^{UT}]^T, \qquad \mathbf{F} = [\mathbf{F}^1 \;\ldots\; \mathbf{F}^U]$$ (formula 4)

The block-diagonalization condition is fulfilled if each F^u is chosen so that its columns lie in 
the null-space of H_c^u, where H_c^u is obtained by removing from H the B_u rows corresponding 
to user u (so H_c^u has $\sum_{k=1,k\neq u}^{U} B_k$ rows): 

$$\mathbf{F}^u \in \mathrm{null}\{\mathbf{H}_c^u\} \;\Leftrightarrow\; \mathbf{H}_c^u\,\mathbf{F}^u = \mathbf{0}$$ (formula 5)
To achieve this, matrix N is introduced which is built as follows: N^1, the first columns of N, 
is an orthogonal basis for the null space of H_c^1; the other columns of N are built in the same 
way for users 2 to U: N = [N^1 ... N^U]. It is easy to see that each N^u has D_u columns where D_u is 
given by: 

$$D_u = A - \sum_{k=1,k\neq u}^{U} B_k$$ (formula 6)

Matrix F is defined as N·E where E is also block diagonal with blocks of dimension B_u×C_u. 
This constrains F to use, per user u, a linear combination of the columns of N^u, which indeed 
block diagonalizes the full channel matrix. These linear combinations are contained in the 
sub-blocks that make up the E matrix. 

Matrix G is similarly designed as a block diagonal matrix where each block has dimension 
C_u×B_u. Figure 4 illustrates the various matrices used together with their dimensions. 
Globally, this strategy is advantageous because the pre- and post-filtering (F and G) can be 
calculated independently per user. Zeroing the MUI is also advantageous to combat near-far 
effects. 
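
The null-space construction can be checked on a small example. The following is a minimal sketch under stated assumptions: the channel is a hypothetical integer Vandermonde matrix (chosen so that every subset of rows is full rank), arithmetic is exact via `fractions.Fraction`, and the null-space basis comes from row reduction, so it is a valid basis but not an orthogonal one as in the text. The helper names `null_space`, `rows_of`, and `dot` are illustrative.

```python
from fractions import Fraction

def null_space(rows):
    """Exact basis of the null space of a matrix given as a list of rows."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(m):
            break
        pivot_row = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot_row is None:
            continue
        m[r], m[pivot_row] = m[pivot_row], m[r]
        pivot_value = m[r][c]
        m[r] = [v / pivot_value for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            vec[p] = -m[row_index][free]
        basis.append(vec)
    return basis

A = 6            # BS antennas
B = [2, 2, 2]    # antennas per user terminal (B_u)
# Hypothetical channel matrix H with sum(B) rows and A columns.
H = [[x ** c for c in range(A)] for x in range(1, sum(B) + 1)]

def rows_of(u):
    start = sum(B[:u])
    return range(start, start + B[u])

def dot(row, vec):
    return sum(a * b for a, b in zip(row, vec))

N = []           # N[u] holds the D_u basis vectors, i.e. the columns of N^u
for u in range(len(B)):
    H_c = [H[i] for i in range(len(H)) if i not in rows_of(u)]   # H with user u's rows removed
    N.append(null_space(H_c))

# Off-diagonal blocks of H*N (rows of other users against columns of user u): the MUI terms.
leakage = [dot(H[i], vec) for u in range(len(B)) for vec in N[u]
           for i in range(len(H)) if i not in rows_of(u)]
# Diagonal blocks H^u N^u: what remains for the per-user processing.
own = [dot(H[i], vec) for u in range(len(B)) for vec in N[u] for i in rows_of(u)]

print([len(basis) for basis in N])
print(all(v == 0 for v in leakage), any(v != 0 for v in own))
```

Each N^u here has D_u = 6 - 4 = 2 columns, in agreement with formula 6, and every entry of H·N outside the diagonal blocks is exactly zero.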

In certain embodiments, the requirements on the number of antennas are as follows: 



$$B_u \geq C_u, \qquad A - \sum_{k=1,k\neq u}^{U} B_k \geq C_u \qquad \text{for all } u$$ (formula 7)



Within these limits, the scheme can accommodate terminals with different numbers of 
antennas, which is an additional advantageous feature. 
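
The antenna requirements of formula 7 are easy to check for a candidate configuration. The helper below is a sketch; the function name and the example configurations are illustrative assumptions, and the inequalities are read as "greater than or equal to".

```python
def satisfies_antenna_requirements(num_bs_antennas, terminal_antennas, streams):
    """Check formula 7 for every user: B_u >= C_u and A - sum_{k != u} B_k >= C_u."""
    total = sum(terminal_antennas)
    for b_u, c_u in zip(terminal_antennas, streams):
        if b_u < c_u:
            return False
        if num_bs_antennas - (total - b_u) < c_u:
            return False
    return True

# Three terminals, two streams each, with different antenna counts.
fully_loaded = satisfies_antenna_requirements(6, [2, 2, 2], [2, 2, 2])
three_rx_eight_bs = satisfies_antenna_requirements(8, [3, 3, 3], [2, 2, 2])
three_rx_seven_bs = satisfies_antenna_requirements(7, [3, 3, 3], [2, 2, 2])
print(fully_loaded, three_rx_eight_bs, three_rx_seven_bs)
```

With three 3-antenna terminals and two streams each, 7 BS antennas leave a null space of only one dimension per user, so at least 8 are needed.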
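[0078] A joint TX-RX MMSE optimization scheme is modified and extended to 
take the block diagonalization constraint into account. To this end, the joint TX-RX 
optimization per user is computed for a MUI-free channel: the optimization is performed over 
channel H^u N^u for user u. One has the following constrained minimization problem, with P^u 
denoting the transmit power of user u and the superscript H the Hermitian transpose: 

$$\min_{\mathbf{F}^u,\mathbf{G}^u} E\left[\left\|\mathbf{x}^u(k) - \hat{\mathbf{x}}^u(k)\right\|_2^2\right] \quad \text{s.t.} \quad \mathrm{trace}\left(\mathbf{F}^{uH}\mathbf{F}^u\right) = P^u$$
$$\min_{\mathbf{E}^u,\mathbf{G}^u} E\left[\left\|\mathbf{x}^u(k) - \mathbf{G}^u\left(\mathbf{H}^u\mathbf{N}^u\mathbf{E}^u\mathbf{x}^u(k) + \mathbf{n}^u(k)\right)\right\|_2^2\right] \quad \text{s.t.} \quad \mathrm{trace}\left(\mathbf{E}^{uH}\mathbf{N}^{uH}\mathbf{N}^u\mathbf{E}^u\right) = P^u$$ (formula 8)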
The constrained optimization is transformed into an unconstrained one using the Lagrange 
multiplier technique. Then, one can minimize the following Lagrangian: 

$$L(\lambda^u, \mathbf{E}^u, \mathbf{G}^u) = E\left[\left\|\mathbf{x}^u(k) - \mathbf{G}^u\left(\mathbf{H}^u\mathbf{N}^u\mathbf{E}^u\mathbf{x}^u(k) + \mathbf{n}^u(k)\right)\right\|_2^2\right] + \lambda^u\left(\mathrm{trace}\left(\mathbf{E}^{uH}\mathbf{N}^{uH}\mathbf{N}^u\mathbf{E}^u\right) - P^u\right)$$ (formula 9)

where λ^u is a parameter that has to be selected to satisfy the power constraint. Following an 
approach similar to [H. Sampath, P. Stoica and A. Paulraj, "Generalised Linear Precoder and 
Decoder Design for MIMO Channels Using the Weighted MMSE Criterion", IEEE 
Transactions on Communications, Vol. 49, No. 12, December 2001] and using the singular 
value decomposition (SVD) of H^u N^u, one obtains the following transmit and receive filters 
F^u and G^u, per user: 

H"N"=U?A(V^)" 

V eJ hn) \ ml j + (formulal0) 

F" = N U E U = N u V^IJ 

where ( . )+ indicates that only the non-negative values are acceptable and, in the last line, 
only the non-zero values of the diagonal matrix are inverted. 

[0079] To illustrate the performance of the proposed SDMA-MIMO scheme, a 
typical multi-user MIMO situation is considered first where a BS equipped with 6 to 8 
antennas is communicating with three 2-antenna UTs. Hence, this set-up has 3x2=6 
simultaneous symbol streams in parallel. The 3 input bit streams at the BS are QPSK 
modulated and demultiplexed into 2 symbol streams each. Each symbol stream is divided into 
packets containing 480 symbols and 100 channel realizations are generated. The entries of 
matrix H are zero mean independent and identically distributed (iid) Gaussian random 
variables with variance 1 and are generated independently for each packet. The total transmit 
power per symbol period across all antennas is normalized to 1. 

[0080] Figure 5 shows the performance of the proposed SDMA-MIMO system for 
the joint TX-RX MMSE design in solid line. Also shown in dotted line is the performance of 
a conventional single user MIMO system with the same number of antennas (6 to 8 antennas 
for the BS, 6 antennas for the UT). The scenario where the BS has 6 antennas is the fully 
loaded case: adding more parallel streams would introduce irreducible MUI. The scenarios 
where the BS has more than 6 antennas are underloaded and some diversity gain is expected. 
The single user system typically has a better performance since it has more degrees of 
freedom available at the receiver for spatial processing (for the conventional case, the receive 
filter matrix is 6x6 while for our multi-user MIMO case the 3 receive filter matrices are 2x2). 
An advantageous feature of the proposed SDMA MIMO system is that adding just one 
antenna at the BS provides a diversity gain of 1 to all simultaneous users. Also, the difference 
between single user and multi-user performance becomes negligible when the number of BS 
antennas increases. 

[0081] Next, the case is considered of an increased number of antennas at the user 
terminal. Two symbol streams are sent to each terminal and the terminals have 3 antennas 
(same number at each terminal). The BS has 8 or 9 antennas to satisfy the requirement in 
formula (7). For comparison, 2 antennas at the UTs have also been simulated. All other 
parameters are substantially identical to those of the first simulation scenario. The simulation 
results are shown in Figure 6. As expected, increasing the number of BS antennas from 8 to 9 
provides a diversity improvement. However, increasing the number of receive antennas 
results in a reduced performance. This counter-intuitive result is due to the fact that the 
higher number of receive antennas reduces the number of columns of N and, hence, the 
apparent channel dimension over which the MMSE optimization takes place. More 
specifically, for 8 antennas at the BS, one has the matrix dimensions given in Figure 7. It can 
be seen that the actual channel (H^u·N^u) available for per-user TX-RX MMSE optimization is 
smaller when the number of RX antennas is large. The BER curves corresponding to these 
two cases match closely with the BER curves of joint TX-RX MMSE optimization of a 2x4 
and 3x2 MIMO system respectively, with a correction of 10·log10(3) = 4.8 dB. This correction 
is due to the power being divided between three users in this exemplary SDMA MIMO 
system. 
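
The dimensions quoted for the two cases follow directly from formula 6. The sketch below recomputes them, together with the power-split correction; the function name is an illustrative assumption.

```python
import math

def effective_channel_shape(num_bs_antennas, terminal_antennas, user):
    """Shape of H^u N^u: B_u receive antennas by D_u = A - sum_{k != u} B_k columns."""
    b_u = terminal_antennas[user]
    d_u = num_bs_antennas - (sum(terminal_antennas) - b_u)
    return (b_u, d_u)

# 8 BS antennas, three terminals with 2 or 3 receive antennas each.
shape_two_rx = effective_channel_shape(8, [2, 2, 2], 0)
shape_three_rx = effective_channel_shape(8, [3, 3, 3], 0)

# Transmit power shared between three users, expressed in dB.
power_split_db = 10 * math.log10(3)

print(shape_two_rx, shape_three_rx, round(power_split_db, 1))
```

Going from 2 to 3 receive antennas shrinks the per-user null space from 4 to 2 columns, which is the reduction in apparent channel dimension that the paragraph above describes.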

[0082] An SDMA MIMO scheme was proposed that block-diagonalizes 
the MIMO channel so that the MUI is completely cancelled. It was applied to a joint TX-RX 
MMSE optimization scheme with transmit power constraint. This design generally results in 
smaller per user optimization problems. The highest - and most economical - performance 
increase is shown to be achieved by increasing the number of antennas at the base station 
side. Increasing the number of antennas at the terminals beyond the number of parallel 
streams must be done carefully. This block diagonalization is very advantageous for MIMO- 
SDMA. In this context, it can be applied to a large range of schemes, including linear and 
non-linear filtering and optimizations, TX-only, RX-only or joint TX-RX optimization. It is 
also applicable for uplink and downlink. Extension to frequency selective channels is 
straightforward with multi-carrier techniques such as OFDM. 

[0083] Consider an approach based on the assumption that the channel is slowly 
varying and hence channel state information can be acquired through either feedback or plain 
channel estimation in TDD-based systems and consider among the possible design criteria, 
the joint transmit and receive Minimum Mean Squared Error (Tx/Rx MMSE) criterion, for it 
is the optimal linear solution for fixed coding and modulation across the spatial subchannels. 
Note that the latter constraint is set to reduce the system's complexity and adaptation 
requirements in comparison to the optimal yet complex bit loading strategy. For a fixed 
number of spatial streams p and fixed symbol modulation, this design devises an optimal 
filter-pair (T,R) that decouples the MIMO channel into multiple parallel spatial subchannels. 
An optimum power allocation policy allocates power only to a selection of subchannels that 
are above a given Signal-to-Noise Ratio (SNR) threshold imposed by the transmit power 
constraint. Furthermore, more power is given to the weaker modes of the previous selection, 
and vice versa. It is clear that the data-streams assigned to the non-selected spatial 
subchannels are lost, giving rise to a high MMSE and consequently a non-optimal Bit-Error 
Rate (BER) performance. Moreover, the arbitrary and initial choice of the number of streams 
p leads to the use of weak modes that consume most of the power. The previous remarks 
show the impact of the choice of p on the power allocation efficiency as well as on the BER 
performance of the joint Tx/Rx MMSE design. Hence, it is relevant to consider the number 
of streams p as an additional design parameter rather than as a mere arbitrary fixed scalar. 

[0084] In this further aspect of the systems and methods, the issue is addressed of 
optimizing the number of streams p of the joint Tx/Rx MMSE under fixed total average 
transmit power and fixed rate constraints for flat-fading MIMO channels in both a single user 
and multi-user context. 

[0085] The considered point-to-point SM MIMO communication system is 
depicted in Figure 8. It represents a transmitter (Tx) and a receiver (Rx), both equipped with 
multiple antennas. The transmitter first modulates 830 the signal received from coder (COD) 
860 and interleaver (Π) 870 and transmits bit-stream b according to a pre-determined 
modulation scheme (this implies the same symbol modulation scheme over all spatial 
substreams), then it demultiplexes 834 the output symbols into p independent streams. This 
spatial multiplexing modulation actually converts the serial symbol-stream s into p parallel 
symbol streams or equivalently into a higher dimensional symbol stream where every symbol 
now is a p-dimensional spatial symbol, for instance s(k) at time k. These spatial symbols are 
then pre-filtered by the transmit filter T 810 and sent onto the MIMO channel through the M_T 
transmit antennas. At the receive side, the M_R received signals are post-filtered by the receive 
filter R 820. The p output streams conveying the detected spatial symbols ŝ(k) are then 
multiplexed 840 and demodulated 844 to recover the initially transmitted bit-stream after 
being fed through deinterleaver (Π^-1) 880 and decoder (DECOD) 890. For a flat-fading 
MIMO channel, the global system equation is given by 

$$\hat{\mathbf{s}}(k) = \mathbf{R}\,\mathbf{H}\,\mathbf{T}\,\mathbf{s}(k) + \mathbf{R}\,\mathbf{n}(k)$$ (formula 11)

where n(k) is the M_R-dimensional receive noise vector at time k and H 850 is the (M_R × M_T) 
channel matrix whose (i,j)-th entry, h_ij, represents the complex channel gain from the j-th 
transmit antenna to the i-th receive antenna. In the sequel, the sampling time index k is 
dropped for clarity. 

[0086] The transmit and receive filters T 810 and R 820, represented by an (M_T × 
p) and a (p × M_R) matrix respectively, are jointly designed to minimize the Mean Squared Error 
(MMSE) subject to an average total transmit power constraint as stated in: 

$$\min_{\mathbf{R},\mathbf{T}} E\left\{\left\|\mathbf{s} - (\mathbf{R}\mathbf{H}\mathbf{T}\mathbf{s} + \mathbf{R}\mathbf{n})\right\|_2^2\right\}$$ (formula 12)
subject to: $\mathrm{trace}(\mathbf{T}\mathbf{T}^H) = P_T$

The statistical expectation E{} is carried out over the data symbols s and noise samples n. 
Moreover, uncorrelated data symbols and uncorrelated zero-mean Gaussian noise samples 
with variance σ_n² are assumed so that one has 

$$E(\mathbf{s}\mathbf{s}^H) = \mathbf{I}_p, \qquad E(\mathbf{n}\mathbf{n}^H) = \sigma_n^2\,\mathbf{I}_{M_R}, \qquad E(\mathbf{s}\mathbf{n}^H) = \mathbf{0}$$ (formula 13)

The trace constraint states that the average total transmit power per p-dimensional spatial 
symbol s after pre-filtering with T equals P_T. 

[0087] Let H = U·Σ_p·V^H be the Singular Value Decomposition (SVD) of the 
equivalent reduced channel corresponding to the p selected subchannels over which the p 
spatially multiplexed data-streams are to be conveyed. Considering the equivalent reduced 
channel corresponding to the p selected subchannels allows the reduction of the later 
introduced Σ_t and Σ_r to their diagonal square principal matrices as one gets rid of their unused 
null-part corresponding to the (M_R - p) remaining and unused subchannels. These p spatial 
subchannels are represented by Σ_p, which is a diagonal matrix containing the p strongest 
subchannels of the actual channel H. The optimization problem stated in formula 12 is solved 
using the Lagrange multiplier technique and leads to the optimal filter-pair (T,R): 

$$\begin{cases} \mathbf{T} = \mathbf{V}\,\boldsymbol{\Sigma}_t \\ \mathbf{R} = \boldsymbol{\Sigma}_r\,\mathbf{U}^H \end{cases}$$ (formula 14)

where Σ_t is the (p × p) diagonal power allocation matrix that determines the power 
distribution among the p spatial subchannels and is given by 



+ 



2 = 



(formula 15) 
subj ect to : traced ) = P r 
The complementary equalization matrix Σ_r is the (p × p) diagonal matrix given by: 

FT 

S r =— X, (formula 16) 

where [x]_+ = max(x,0) and λ is the Lagrange multiplier to be calculated to satisfy the trace 
constraint of formula 15. The filter-pair MMSE solution (T, R) of formula 14 clearly 
decouples the MIMO channel matrix H 850 into p parallel subchannels. Among the latter 
available subchannels, those above a given SNR threshold, imposed by the transmit power 
constraint, are allocated power as described in formula 15. Furthermore, more power is 
allocated to weaker modes of the previous selection and vice-versa, leading to an asymptotic 
zero-forcing behavior as subsequently shown: 



$$\mathbf{R}\,\mathbf{H}\,\mathbf{T} \rightarrow \mathbf{I}_p \quad \text{when } \sigma_n \rightarrow 0$$ (formula 17)
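
The behavior of the power allocation (dropping modes below the threshold, favoring the weaker selected modes, and approaching zero-forcing at low noise) can be illustrated numerically. The sketch below is not the specification's own procedure: it assumes unit-power symbols as in formula 13, works directly on the singular values, and uses the closed form obtained by setting the derivative of the Lagrangian to zero, Σ_t² = [(σ_n/√λ)·Σ_p^-1 - σ_n²·Σ_p^-2]_+ with Σ_r = (√λ/σ_n)·Σ_t. That closed form, the bisection search for λ, and all names and numbers are assumptions of the sketch.

```python
import math

def joint_mmse_allocation(singular_values, noise_std, total_power):
    """Return (t, r): diagonals of Sigma_t and Sigma_r for unit-power symbols."""
    def power_used(mu):  # mu stands for 1/sqrt(lambda)
        return sum(max(noise_std * mu / s - noise_std ** 2 / s ** 2, 0.0)
                   for s in singular_values)

    # Bisection on mu so that trace(Sigma_t^2) meets the power constraint.
    lo, hi = 0.0, 1.0
    while power_used(hi) < total_power:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if power_used(mid) < total_power:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)

    t = [math.sqrt(max(noise_std * mu / s - noise_std ** 2 / s ** 2, 0.0))
         for s in singular_values]
    r = [ti / (noise_std * mu) for ti in t]  # Sigma_r = (sqrt(lambda)/sigma_n) * Sigma_t
    return t, r

singular_values = [2.0, 1.0, 0.5]  # hypothetical subchannel gains, strongest first

# High noise: the weakest mode falls below the threshold and gets no power.
t_noisy, r_noisy = joint_mmse_allocation(singular_values, noise_std=1.0, total_power=1.0)

# Low noise: all modes are used and the end-to-end gains r*sigma*t approach 1.
t_clean, r_clean = joint_mmse_allocation(singular_values, noise_std=1e-3, total_power=1.0)
gains_clean = [r * s * t for r, s, t in zip(r_clean, singular_values, t_clean)]

print([round(x * x, 3) for x in t_noisy])
print([round(g, 3) for g in gains_clean])
```

In the low-noise run the per-mode power grows as the singular value shrinks, which is the "more power to weaker modes" behavior, and the product R·H·T is numerically close to the identity.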



[0088] The above-described Tx/Rx MMSE design is derived for a given number 
of streams p, which is arbitrarily chosen and fixed. Hence, the filter-pair solution can be 
accurately denoted as (T_p, R_p). These p streams will always be transmitted regardless of the 
power allocation policy that may, as previously explained, allocate no power to certain 
subchannels. The streams assigned to the latter subchannels are then lost, contributing to a 
bad overall BER performance. Furthermore, as the SNR increases, these initially disregarded 
modes may eventually be selected and may monopolize most of the power budget, leading to 
an inefficient power allocation solution. Both previous remarks highlight the influence of the 
choice of p on the system performance and power allocation efficiency. Hence the 
motivation to include p as a design parameter in order to optimize the system performance. 

[0089] For a fixed number of streams p and a fixed symbol modulation scheme 
across these streams, the optimal joint Tx/Rx MMSE solution, given by the filter-pair (T_p, 
R_p), gives rise to the minimum Mean Squared Error MSE_p: 



MSE p = trace 



\ p * p r ) « 



2 
r 

noise contribution 



(formula 18) 



imperfect equalization 

which consists of two distinct contributions, namely the imperfect subchannel gain 
equalization contribution and the noise contribution. Certain embodiments of the systems and 
methods minimize MSEp with respect to the number of streams p under a fixed rate 
constraint. The same symbol modulation scheme is assumed across the spatial substreams for 
a low-complexity optimal joint Tx/Rx MMSE design. This symbol modulation scheme, 
however, can be adapted to p to satisfy the fixed reference rate R. Hence, the constellation 
size corresponding to a given number of spatial streams p is denoted M_p. The proposed 
optimization problem can be drawn: 

$$\min_{p}\,\{MSE_p\} \quad \text{subject to: } p \times \log_2(M_p) = R$$ (formula 19)

[0090] The resulting design (p_opt, M_opt, T_opt, R_opt) is referred to as the spatially 
optimized Joint Tx/Rx MMSE design. For rectangular QAM constellations (e.g., E_s = 2(M_p - 
1)/3), the constrained minimization problem formulated in formula 19 reduces to: 

(formula 20)

The latter formulation suggests that the optimal p_opt is the number of spatial streams that enables 
a reasonable constellation size M_opt while achieving the optimal power distribution that 
balances, on the one hand, the achieved SNR on the used subchannels and, on the other hand, 
the receive noise enhancement. 

[0091] To illustrate the existence of p_opt, the optimization problem of formula 20 
is solved for a case-study MIMO set-up where M_T = M_R = 6. An average total transmit power 
P_T = 1 is assumed, an average receive SNR = 20 dB and a reference rate R = 12 bits/channel 
use. Moreover, as for all the included simulations, the MIMO channel is considered to be 
stationary flat-fading and is modeled as an M_R × M_T matrix with iid unit-variance zero-mean 
complex Gaussian entries. Moreover, perfect (error-free) Channel State Information (CSI) is 
assumed at both transmitter and receiver sides. Figure 9 shows p_opt's existence (a) and 
distribution (b) when evaluated over a large number of channel realizations. 

[0092] The reference rate R certainly determines, for a given (M_T, M_R) MIMO 
system, the optimal number of streams p_opt as the MSE_p explicitly depends on R as shown in 
formula 20. This is illustrated in Figure 10 where p_opt clearly increases as the reference rate R 
increases. Indeed, to convey a much higher rate R at reasonable constellation sizes, a larger 
number of parallel streams is desirable. 

[0093] The dependence of p_opt on the SNR is investigated. For a sample channel 
of the previous MIMO case study, Figure 10 illustrates the system's MSE_p for different SNR 
values. As expected, the MSE_p globally diminishes as the SNR increases. The optimal 
number of streams p_opt, however, stays the same for the considered channel. This result is 
predictable since the noise power σ_n² is assumed to be the same on every receive antenna. In 
these circumstances, the power allocation matrix Σ_t basically acts on the subchannel gains 
(σ_p) (1 ≤ p ≤ min(M_T, M_R)) in Σ_p trying to balance them while Σ_r equalizes these channel gains. 
Consequently, to convey a reference rate R through a given (M_T, M_R) MIMO channel using a 
transmit power P_T, there exists a unique p_opt which is independent of the SNR. This allows 
the assumption of the asymptotic high SNR situation when computing p_opt. In a high SNR 
situation, the power budget is sufficient for Σ_t to select and allocate power to all p necessary 
modes as shown in formula 17. From this, one can find the expression of the Lagrange 
multiplier λ and re-write formula 15 as follows 

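$$\boldsymbol{\Sigma}_t^2 = \frac{\sigma_n}{\sqrt{\lambda E_s}}\,\boldsymbol{\Sigma}_p^{-1} - \frac{\sigma_n^2}{E_s}\,\boldsymbol{\Sigma}_p^{-2}$$ (formula 21)

where: $$\lambda = \left(\frac{\sqrt{E_s}\,\sigma_n\,\mathrm{trace}(\boldsymbol{\Sigma}_p^{-1})}{P_T + \sigma_n^2\,\mathrm{trace}(\boldsymbol{\Sigma}_p^{-2})}\right)^2$$

Using the previous expressions of Σ_t and λ and that of Σ_r given in formula 16, the expression 
of MSE_p reduces to: 

$$MSE_p = \frac{E_s\,\sigma_n^2\,\left(\mathrm{trace}(\boldsymbol{\Sigma}_p^{-1})\right)^2}{P_T + \sigma_n^2\,\mathrm{trace}(\boldsymbol{\Sigma}_p^{-2})}$$ (formula 22)

For high SNRs, the second term in the denominator, σ_n²·trace(Σ_p^-2), is negligible compared to 
P_T. Furthermore, the noise level can be removed. Hence, a simplified error expression Err_p 
can be drawn 

$$Err_p = \frac{2}{3}\left(2^{R/p} - 1\right)\cdot\left(\mathrm{trace}(\boldsymbol{\Sigma}_p^{-1})\right)^2$$ (formula 23)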

[0094] The complex $\mathrm{MSE}_p$ expression of formula 20, which depends on a large
number of parameters and which is composed of highly inter-dependent quantities, can be
reduced to a simplified expression $Err_p$ that preserves the same monotonicity and thus the same
$p_{opt}$, as corroborated in Figure 8. The simplified $Err_p$, expressed in formula 23, is a product of
only two terms, each depending on a single system parameter, namely the channel singular
matrix $\Sigma_p$ and the reference rate R. The proposed simplified $Err_p$ eases the computation of
$p_{opt}$ and, more advantageously, does not require noise power estimation.
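
The selection of the number of streams from the simplified error expression can be sketched as follows. This is only an illustrative sketch: the function names `err_p` and `select_p_opt` are hypothetical, the channel is represented by a plain list of singular values, and the restriction to stream counts that divide R (so that each stream carries an integer number of bits) is an assumption not stated in the text. The sketch does not check that the resulting constellation is a square QAM.

```python
def err_p(singular_values, p, rate_bits):
    """Simplified error (2/3) * (2**(R/p) - 1) * trace(Sigma_p^-1)**2 on the p strongest modes."""
    strongest = sorted(singular_values, reverse=True)[:p]
    trace_inv = sum(1.0 / s for s in strongest)
    return (2.0 / 3.0) * (2.0 ** (rate_bits / p) - 1.0) * trace_inv ** 2


def select_p_opt(singular_values, rate_bits):
    """Return (p_opt, M_opt) minimizing err_p under the rate constraint p * log2(M_p) = R."""
    max_streams = len(singular_values)
    # Assumption: only stream counts giving an integer number of bits per stream are tried.
    candidates = [p for p in range(1, max_streams + 1) if rate_bits % p == 0]
    p_opt = min(candidates, key=lambda p: err_p(singular_values, p, rate_bits))
    return p_opt, 2 ** (rate_bits // p_opt)
```

For a hypothetical channel with singular values 3, 2, 1 and 0.1, the sketch selects two streams at R = 12 and three streams at R = 24, in line with the observation that a higher reference rate favors more parallel streams. No noise power enters the computation.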

[0095] Previously, for each channel realization, the existence was exhibited of an
optimal number of spatial streams $p_{opt}$ which minimizes the system's $\mathrm{MSE}_p$. Consequently,
a spatially optimized joint Tx/Rx MMSE design is described that adaptively determines and
uses $p_{opt}$ and its corresponding constellation size $M_{opt}$. In this section, it is investigated how
the bit error rate (BER) performance of the proposed design compares to those of the conventional
joint Tx/Rx MMSE and the optimal spatial adaptive loading, where the number of spatial
streams p is arbitrarily fixed. Figure 13 depicts the BER performance of the conventional
joint Tx/Rx MMSE design for different fixed numbers of streams and that of our spatially
optimized joint Tx/Rx MMSE for the case study (6,6) MIMO system. The optimized joint
Tx/Rx MMSE offers a 10.4 dB SNR gain over full spatial multiplexing, where the maximum
number of spatial streams is used, $p = \min(M_T, M_R)$, at BER = $10^{-2}$ and reference rate R = 12
bits/channel use. Such a significant performance improvement can be attributed to one or
more of several reasons. First, the optimized joint Tx/Rx MMSE design is mostly using
$p_{opt} = 3$, as can be seen in Figure 9, which is lower than p = 6 used in the full spatial
multiplexing case. Reducing the number of used spatial streams allows a better exploitation
of the system's spatial diversity, which explains in part the observed higher curve slope.
Second, reducing the number of used streams translates into a higher gain equivalent channel.
The optimized joint Tx/Rx MMSE design uses the best $p_{opt}$ subchannels and discards the
weak ones. In addition, the optimized constellation size $M_{opt}$ guarantees an optimal BER
performance. The latter point illustrates that the joint Tx/Rx MMSE design outperforms the
conventional joint Tx/Rx MMSE design with fixed p = 2 streams, whereas the latter case
better illustrates the former points.

[0096] Figure 14 illustrates the comparison between the BER performance of the
spatially optimized joint Tx/Rx MMSE and that of the optimal joint Tx/Rx MMSE design for
the previously considered MIMO set-up and different reference rates, namely R = {12, 18, 24}
bits per channel use. The optimal joint Tx/Rx MMSE refers to the design that adaptively (for
each channel realization) determines and uses the number of spatial streams p and the
constellation size $M_p$ that minimize the system BER under average total transmit power and
rate constraints. Figure 14 also illustrates a lower BER bound corresponding to the optimal
performance of spatial adaptive loading, combined with MMSE detection. The loading
algorithm used herein is the Fischer algorithm, although other analogous algorithms could
alternatively be used.

[0097] The spatially optimized design clearly exhibits the same average 
performance as the optimal MMSE. This suggests that the optimization criterion, namely 
global MSE minimization (see, e.g., formula 19), equivalently minimizes the system's BER. 
Furthermore, the optimized joint Tx/Rx MMSE design exhibits less than 2 dB SNR loss at
BER = $10^{-3}$ compared to the spatial adaptive loading. This performance difference can be
attributed at least in part to the adaptive loading that adapts not only the used number of 
streams, but also the constellation sizes across these streams, to achieve the lowest possible 
BER performance. The optimized joint Tx/Rx MMSE design assumes fixed constellations 
across the spatial streams to reduce the adaptation requirements and complexity. In addition, 
the optimized joint Tx/Rx MMSE appears to achieve the same diversity order as spatial 
adaptive loading, as their BER curves have substantially the same slope. 

[0098] In some embodiments, wherein the number of spatial streams used by the
spatial multiplexing joint Tx/Rx MMSE design is optimized, the spatial diversity offered by
MIMO systems is better exploited, which significantly improves the system's BER
performance. Thus, the systems and methods include a new spatially optimized joint Tx/Rx
MMSE design. For a (6,6) MIMO set-up, the latter proposed design exhibits a 10.4 dB gain
over the full spatial multiplexing conventional design for a BER = $10^{-2}$, a unit average total
transmit power, a reference rate R = 12 bits per channel use and an i.i.d. channel. Furthermore, the
optimality of the spatially optimized joint Tx/Rx MMSE is shown for fixed modulation
across streams. Including the number of streams as a design parameter for spatial
multiplexing MIMO systems can provide significant performance enhancement.

[0099] An alternative spatial-mode selection criterion targets the minimization of 
the system BER, which is applicable for both uncoded and coded systems. This criterion 
examines the BERs on the individual spatial modes in order to identify the optimal number of 
spatial streams to be used for a minimum system average BER. 

[0100] Both described conventional and even-MSE joint Tx/Rx MMSE designs 
have been derived for a given number of spatial streams p which is arbitrarily chosen and 
fixed. These p streams will always be transmitted regardless of the power allocation policy 
that may, as previously highlighted, allocate no power to certain weak spatial subchannels.
The data streams assigned to the latter subchannels are then lost, leading to a poor overall bit 
error rate (BER) performance. Furthermore, as the SNR increases, these initially disregarded 
modes will eventually be given power and will monopolize most of the available transmit 
power, leading to an inefficient power allocation strategy that detrimentally impacts the 
strong modes. Finally, it has been shown that the spatial subchannel gains exhibit decreasing 
diversity orders. This means that the weakest used subchannel sets the spatial diversity order 
exploited by joint Tx/Rx MMSE design. The previous remarks highlight the influence of the 
choice of p on the transmit power allocation efficiency, the exhibited spatial diversity order 
and thus on the joint Tx/Rx MMSE designs' bit error rate performance. Hence, it 
alternatively is proposed to include p as a design parameter to be optimized according to the 
available channel knowledge for an improved system BER performance, which is 
subsequently referred to as spatial-mode selection. This approach is applicable for both 
uncoded and coded systems. 

[0101] It is advantageous to achieve a spatial-mode selection criterion that
minimizes the system's BER. In order to identify such a criterion, we can subsequently derive
the expression of the conventional joint Tx/Rx MMSE design's average BER and analyze the
respective contributions of the individual used spatial modes. For the used Gray-encoded
square QAM constellations of size $M_p$ and minimum Euclidean distance $d_{min} = 2$, the
conventional joint Tx/Rx MMSE design's average BER across p spatial modes, denoted
$\mathrm{BER}_{conv}$, is approximated by

$$\mathrm{BER}_{conv} \approx \frac{2}{\log_2(M_p)} \left(1 - \frac{1}{\sqrt{M_p}}\right) \frac{1}{p} \sum_{k=1}^{p} \mathrm{erfc}\!\left(\sqrt{\frac{\sigma_k^2\,\sigma_{T,k}^2}{\sigma_n^2}}\right) \qquad \text{(formula 24)}$$

where $\sigma_k$ denotes the k-th diagonal element of $\Sigma_p$, which represents the k-th spatial mode
gain. Similarly, $\sigma_{T,k}$ is the k-th diagonal element of $\Sigma_T$, whose square designates the transmit
power allocated to the k-th spatial mode. Hence, the argument $\sigma_k^2\,\sigma_{T,k}^2 / \sigma_n^2$ is easily identified as the
average Signal-to-Noise Ratio (SNR), normalized to the symbol energy $E_s$, on the k-th spatial
mode. For a given constellation $M_p$, these average SNRs clearly determine the BER on their
corresponding spatial modes. The conventional design's average BER performance, however,
depends on the SNRs on all p spatial modes as shown in formula (24). Consequently, the
(p × p) diagonal SNR matrix $\mathrm{SNR}_p$, whose diagonal consists of the average SNRs on the p
spatial modes, better characterizes the conventional design's BER:

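$$\mathrm{SNR}_p = \frac{\Sigma_p^2 \cdot \Sigma_T^2}{\sigma_n^2} \qquad \text{(formula 25)}$$

Replacing the transmit power allocation matrix $\Sigma_T$ by its expression formulated in formula
(21), the previous $\mathrm{SNR}_p$ expression can be developed into:

$$\mathrm{SNR}_p = \frac{1}{\sigma_n \sqrt{E_s}\,\lambda^{1/2}}\,\Sigma_p - \frac{1}{E_s}\,I_p \qquad \text{(formula 26)}$$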



The latter expression illustrates that the conventional joint Tx/Rx MMSE design induces
uneven SNRs on the different p spatial streams. More importantly, formula (26) shows that
the weaker the spatial mode is, the lower its experienced SNR. Hence, the conventional joint
Tx/Rx MMSE BER, $\mathrm{BER}_{conv}$, of formula (24) can be rewritten as follows:

$$\mathrm{BER}_{conv} \approx \frac{2}{\log_2(M_p)} \left(1 - \frac{1}{\sqrt{M_p}}\right) \frac{1}{p} \sum_{k=1}^{p} \mathrm{erfc}\!\left(\sqrt{\mathrm{SNR}_p(k,k)}\right) \qquad \text{(formula 27)}$$

The previous SNR analysis further indicates that the p spatial modes exhibit uneven BER
contributions and that the contribution of the weakest p-th mode, corresponding to the lowest SNR
$\mathrm{SNR}_p(p,p)$, dominates $\mathrm{BER}_{conv}$. Consequently, in order to minimize $\mathrm{BER}_{conv}$, the optimal
number of streams to be used, $p_{opt}$, may be the one that maximizes the SNR on the weakest
used mode under a fixed rate R constraint. The latter proposed spatial-mode selection
criterion can be expressed:

$$\max_{p}\; \mathrm{SNR}_p(p,p) \quad \text{subject to: } p \times \log_2(M_p) = R \qquad \text{(formula 28)}$$

The rate constraint shows that, although the same symbol constellation may be used across
spatial streams, the selection/adaptation of the optimal number of streams $p_{opt}$ includes the
joint selection/adaptation of the used constellation size $M_{opt}$. Using formula (26) for the
considered square QAM constellations (i.e., $E_s = 2(M_p - 1)/3$), the spatial-mode selection
criterion stated in formula (28) can be further refined into:




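$$p_{opt} = \arg\max_{p} \left[ \frac{\sigma_p}{\sigma_n \sqrt{\tfrac{2}{3}\left(2^{R/p} - 1\right)}\;\lambda^{1/2}} - \frac{1}{\tfrac{2}{3}\left(2^{R/p} - 1\right)} \right] \qquad \text{(formula 29)}$$

The latter spatial-mode selection problem has to be solved for the current channel realization
to identify the optimal pair $\{p_{opt}, M_{opt}\}$ that minimizes the system's average BER, $\mathrm{BER}_{conv}$.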
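
A search over the candidate numbers of streams for this max-min SNR criterion might look as follows. This is a sketch under explicit assumptions rather than the claimed implementation: the names `weakest_mode_snr` and `select_modes` are hypothetical, and the weakest-mode SNR is assumed to take the high-SNR closed form $\sigma_p / (\sigma_n \sqrt{E_s}\,\lambda^{1/2}) - 1/E_s$, with $\lambda^{1/2} = \sqrt{E_s}\,\sigma_n\,\mathrm{trace}(\Sigma_p^{-1}) / (P_T + \sigma_n^2\,\mathrm{trace}(\Sigma_p^{-2}))$ and $E_s = 2(2^{R/p} - 1)/3$. Only stream counts that divide R are tried, which is also an assumption.

```python
import math


def weakest_mode_snr(singular_values, p, rate_bits, noise_var, tx_power):
    """SNR on the weakest of the p strongest modes (assumed high-SNR closed form)."""
    modes = sorted(singular_values, reverse=True)[:p]
    e_s = 2.0 * (2.0 ** (rate_bits / p) - 1.0) / 3.0  # square-QAM symbol energy, d_min = 2
    trace_inv = sum(1.0 / s for s in modes)
    trace_inv_sq = sum(1.0 / s ** 2 for s in modes)
    sqrt_lambda = math.sqrt(e_s * noise_var) * trace_inv / (tx_power + noise_var * trace_inv_sq)
    snr = modes[-1] / (math.sqrt(noise_var * e_s) * sqrt_lambda) - 1.0 / e_s
    return max(snr, 0.0)  # a mode that would receive no power carries no signal


def select_modes(singular_values, rate_bits, noise_var, tx_power):
    """Return (p_opt, M_opt) maximizing the weakest-mode SNR under p * log2(M_p) = R."""
    # Assumption: only stream counts giving an integer number of bits per stream are tried.
    candidates = [p for p in range(1, len(singular_values) + 1) if rate_bits % p == 0]
    p_opt = max(candidates,
                key=lambda p: weakest_mode_snr(singular_values, p, rate_bits, noise_var, tx_power))
    return p_opt, 2 ** (rate_bits // p_opt)
```

The search is exhaustive over at most $\min(M_T, M_R)$ candidates, so it is cheap enough to repeat for every channel realization.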

[0102] A spatial-mode selection is derived based on the conventional joint Tx/Rx
MMSE design because this design represents the core transmission structure on which the
even-MSE design is based. An exemplary strategy is to first use a spatial-mode selection to
optimize the core transmission structure $\{\Sigma_T, \Sigma_p, \Sigma_R\}$; the even-MSE design then additionally
applies the unitary matrix Z, which is now a $p_{opt}$-tap IFFT, to further balance the MSEs and
the SNRs across the used $p_{opt}$ spatial streams.

[0103] A key element in the method described above is the block diagonalization
concept, which is to be realized by carefully determining the matrices F and G. Indeed, the
pre-filtering at the base station allows the pre-compensation of the channel phase (and
amplitude) in such a way that simultaneous users receive their own signal free of MUI.
Additionally, this technique relies on quasi-perfect downlink channel knowledge, which can
be acquired during the uplink or during the downlink and fed back by signaling. From the
point of view of minimizing the signaling overhead and resistance to channel time-variations,
the former approach is preferred. One starts from the assumption that the channel is
reciprocal, so that the downlink channel matrix is simply the transpose of the uplink channel
matrix.

[0104] However, the 'channel' is actually made up of several parts: the 
propagation channel (the medium between the antennas), the antennas and the transceiver 
RF, IF and baseband circuits at both sides of the link. The transceiver circuits are usually not 
reciprocal and this can jeopardize the system performance. 

[0105] A system with a multi-antenna base station and a single antenna terminal 
is described. Note, however, that in other embodiments, the system is extended to a full 
MIMO scenario (multi-antenna base station and a multi-antenna terminal). 

[0106] In the uplink, U user mobile terminals transmit simultaneously to a BS
using A antennas. Each user u employs conventional OFDM modulation with N sub-carriers
and a cyclic prefix of length P. Each user signal is filtered, up-converted to RF, and transmitted
over the channel to the BS. Each BS antenna collects the sum of the U convolutions and adds
additive white Gaussian noise (AWGN). In each antenna branch, the BS then down-converts
and filters the signals, removes the cyclic prefix and performs a direct Fourier transform, which
yields the frequency domain received signals $y_a[n]$. If the cyclic prefix is sufficiently large
and with proper carrier and symbol synchronization, the BS observes the linear channel
convolutions as cyclic and the following linear frequency domain model results on each
sub-carrier n:

$$y_{UL}[n] = H_{UL}[n] \cdot x_{UL}[n] + n[n] \qquad \text{(formula 30)}$$

where $x_{UL}[n]$ is the column vector of the U frequency domain symbols at sub-carrier n
transmitted by the terminals, $y_{UL}[n]$ is the column vector of the A signals received by the BS
antenna branches, and $H_{UL}[n]$ is the composite uplink channel. In the sequel, the explicit
dependency on [n] is dropped for clarity.
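
The role of the cyclic prefix in obtaining a per-sub-carrier model can be checked numerically. The following is a minimal single-link sketch (one transmitter, one receive antenna, no noise), with hypothetical helper names `dft` and `ofdm_receive`, assuming a channel impulse response no longer than the prefix plus one tap and a naive DFT from the standard library only.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a list of (complex) samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def ofdm_receive(symbol_time, channel_taps, prefix_len):
    """Send one time-domain OFDM symbol through an FIR channel and return the post-DFT samples."""
    n = len(symbol_time)
    tx = symbol_time[n - prefix_len:] + symbol_time  # prepend the cyclic prefix
    # Linear convolution with the channel impulse response (samples before t = 0 are zero).
    rx = [sum(channel_taps[l] * tx[i - l] for l in range(len(channel_taps)) if i - l >= 0)
          for i in range(len(tx))]
    return dft(rx[prefix_len:prefix_len + n])  # remove the prefix, go to frequency domain
```

With a sufficiently long prefix, each output sample equals the transmitted frequency-domain symbol multiplied by the channel frequency response on that sub-carrier. In the multi-user model above, each entry of the composite channel matrix is one such per-sub-carrier gain.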

[0107] Including the terminal transmitters and the BS receiver, $H_{UL}[n]$ can be
expressed as:

$$H_{UL} = D_{RX,BS} \cdot H \cdot D_{TX,MT} \qquad \text{(formula 31)}$$

where $D_{RX,BS}$ and $D_{TX,MT}$ are complex diagonal matrices containing, respectively, the BS
receiver and mobile terminal transmitters frequency responses (as used herein, the letter D
signifies that the matrices are diagonal). The matrix H includes the propagation channel
itself, which is reciprocal. In order to recover the transmitted symbols, the BS uses a channel
estimation algorithm that provides the estimate $\hat{H}_{UL}$ affected by $D_{RX,BS}$ and $D_{TX,MT}$.

[0108] For embodiments of the downlink, SDMA separation is achieved by
applying a per-carrier pre-filter that pre-equalizes the channel. This pre-filtering is included in
the $F_{DL}$ matrix of the frequency domain linear model:

$$y_{DL} = H_{DL} \cdot F_{DL} \cdot D_P \cdot x_{DL} + n \qquad \text{(formula 32)}$$

where $x_{DL}$ is the column vector of the U symbols transmitted by the BS, $y_{DL}$ is the column
vector of the U signals received by the terminals, $D_P$ is an optional power scaling diagonal
matrix and $H_{DL}$ is the composite downlink channel. $H_{DL}$ is also affected by the BS and
terminal hardware:

$$H_{DL} = D_{RX,MT} \cdot H^T \cdot D_{TX,BS} \qquad \text{(formula 33)}$$

where $D_{TX,BS}$ and $D_{RX,MT}$ are complex diagonal matrices containing, respectively, the BS
transmitter and mobile terminal receivers frequency responses and $H^T$ is the transpose of H,
the uplink propagation channel. Clearly, one can use $H^T$ for the downlink if the downlink
transmission occurs without significant delay after the uplink channel estimation, compared
to the coherence time of the channel. For the following description, the channel is assumed to
be static or slowly varying, which is a valid assumption for indoor WLAN channels.

[0109] For the channel inversion strategy, the pre-filtering matrix is the inverse
(or pseudo-inverse if U<A) of the transpose of the uplink channel matrix so that, preferably,
the product of the pre-filtering matrix and the downlink channel matrix is the identity matrix.
Assuming substantially perfect channel estimation, one can substitute $H_{UL}$ for $\hat{H}_{UL}$ and
express $F_{DL}$ as:

$$F_{DL} = (\hat{H}_{UL})^{-T} = (H_{UL})^{-T} = (D_{RX,BS} \cdot H \cdot D_{TX,MT})^{-T} \qquad \text{(formula 34)}$$

[0110] Finally, replacing $F_{DL}$ and $H_{DL}$ in the downlink linear system model (see
formula 32), the received downlink signal per sub-carrier becomes:

$$y_{DL} = D_{RX,MT} \cdot H^T \cdot D_{TX,BS} \cdot D_{RX,BS}^{-1} \cdot H^{-T} \cdot D_{TX,MT}^{-1} \cdot D_P \cdot x_{DL} + n \qquad \text{(formula 35)}$$

Note that the introduction of $D_P$ allows this model to support other downlink strategies such
as channel orthogonalization and, more generally, power control in the downlink.

[0111] The linear model presented above lends itself to several useful
interpretations and highlights the origin of the MUI:

• The effect of channel pre-filtering may be altered by the two diagonal matrices
appearing between $H^T$ and $H^{-T}$. This is due to transceiver effects at the BS. What
causes MUI is the BS non-reciprocity: $D_{TX,BS} \cdot (D_{RX,BS})^{-1}$ is not equal to the
identity matrix multiplied by a scalar, although this product is diagonal. However,
the identity matrix, multiplied by an arbitrary complex scalar, could be "inserted"
between $H^T$ and $H^{-T}$ without causing MUI.
• The terminal front-end effects ($D_{TX,MT}$ and $D_{RX,MT}$) generally do not contribute to
MUI. However, even with perfect BS reciprocity, $D_{TX,MT}$ and $D_{RX,MT}$ will result in
scaling and rotation of the constellations received at the UT, which imposes the
use of an equalizer at the UT. The power scaling matrix $D_P$ also contributes to
amplitude modifications that must be equalized at the terminal. Note that the
terminal equalizer is a conventional time-only equalizer (as opposed to space-time
equalizers). This equalizer is also useful to compensate the unknown phase of the
base station RF oscillator at TX time.
• The propagation matrix H in this model also includes the parts of the BS or
terminals that are common to uplink and downlink, hence reciprocal. This is the
case for the antennas and for any common component inserted between the
antenna and the Tx/Rx switch (or circulator).
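
The first interpretation can be illustrated with a small numerical experiment on a two-antenna BS serving two single-antenna terminals. This is only a sketch: the 2×2 helper functions and the names `effective_downlink` and `mui_leakage` are hypothetical, the terminal front-ends and the power scaling matrix are left out, and the channel values are arbitrary.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inverse(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def diag(x, y):
    return [[x, 0j], [0j, y]]


def effective_downlink(h, d_tx_bs, d_rx_bs):
    """Return H^T * D_TX,BS * D_RX,BS^-1 * H^-T, the core of the downlink model."""
    h_t = transpose(h)
    mismatch = matmul(d_tx_bs, inverse(d_rx_bs))
    return matmul(matmul(h_t, mismatch), inverse(h_t))


def mui_leakage(m):
    """Sum of the off-diagonal magnitudes, i.e. the signal leaking between the two users."""
    return abs(m[0][1]) + abs(m[1][0])
```

When the BS transmit responses equal the receive responses times one common complex scalar, the result is that scalar times the identity and the leakage vanishes. When the two branches differ by distinct factors, off-diagonal terms appear, which is the MUI described above.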

[0112] In an embodiment of the system, the MUI introduced by the BS front-end
can be avoided by a calibration method that allows measuring the $D_{TX,BS} \cdot (D_{RX,BS})^{-1}$ product at
the BS so that the mismatches can be pre-compensated digitally at the transmitter.

[0113] The block diagram of the SDMA BS Transceiver with the calibration
hardware is illustrated in Figure 15. In this block diagram, the complex frequency response of
each transmitter 1510 1514 is represented by a single transfer function $d_{TX,BS,a}[n]$ 1530 1534,
which is, for the transmitter 1514 of antenna branch a, the concatenation (product) of the
frequency response of the baseband section with the low pass equivalent of the IF/RF section
frequency response. These terms are the diagonal elements of the $D_{TX,BS}[n]$ matrix. A similar
definition holds for $d_{RX,BS,a}[n]$ 1540 1544.

[0114] Before calibration, the carrier frequency and the transceiver parameters 
that have an effect on the amplitude or phase response of the transmitter 1510 1514 and/or 
receiver 1520 1524 are set. This includes attenuator, power level, pre-selection filters, carrier 
frequency, gain of variable gain amplifiers, etc. Note that this may require several calibrations 
for a given carrier frequency. Once the parameters are set, the frequency responses are 
assumed static. The calibration is achieved in two steps: TX-RX calibration and RX-only 
calibration. Note that all described calibration operations are complex. 

[0115] In one step, the transmit-receive calibration is performed (measurement of
$D_{TX,BS} \cdot D_{RX,BS}$). A transmit/receive/calibration switch 1550 1554 is put in calibration mode:
T and R are connected, so as to realize a loopback connection where the transmitter signal is
routed all the way from baseband to RF and back from RF to baseband in the receiver 1520
1524. The RF calibration noise source 1580 is turned off. In each antenna branch, a suitable
known signal $s_a$ (an OFDM symbol with low peak-to-average power ratio) is generated by the
digital modem so as to measure the frequency response of the cascaded transmitter and
receiver. With the usual assumptions of perfect synchronization and cyclic prefix length, the
frequency domain received signal is:

$$r_1^k = D_{TX,BS} \cdot D_{RX,BS} \cdot s + n^k \qquad \text{(formula 36)}$$

where $s = [s_1 \ldots s_A]^T$ and $n^k$ is a noise vector, the main contribution of which comes from the
noise figure of the LNAs 1560 1564. K measurements are taken, which is reflected by the index k.
The $D_{TX,BS} \cdot D_{RX,BS}$ product can be estimated by averaging the K values of $r_1^k$:

$$a = \frac{1}{K} \sum_{k=1}^{K} r_1^k = \mathrm{diag}(D_{TX,BS} \cdot D_{RX,BS}) \qquad \text{(formula 37)}$$

[0116] In an additional step, there is only receive calibration (measurement of
$D_{RX,BS}$). The transmit/receive/calibration switch 1550 1554 is connected so as to isolate the
receiver 1520 1524 from both the transmitter 1510 1514 and the antenna 1570 1574. The
calibration noise source 1580 is turned on. Its excess noise ratio (ENR) is typically sufficient
to exceed the thermal noise generated by the LNAs 1560 1564 by 20 dB or more. The signal
is sampled and measured at baseband in the receiver 1520 1524 of all antenna branches
substantially simultaneously, which is advantageous for perfect phase calibration. The
received frequency domain signal is:

$$r_2^k = d_{RX,BS} \cdot n_{ref}^k + n^k \qquad \text{(formula 38)}$$

where $d_{RX,BS} = \mathrm{diag}(D_{RX,BS})$ is the vector of receiver frequency responses and $n_{ref}$ is the
reference noise injected at RF, substantially identical at the input of all
antenna branches. $D_{RX,BS}$ cannot be extracted directly, even by averaging, since $n_{ref}$ appears
as a multiplicative term. However, since an identical error coefficient in all antenna branches
is allowed, $n_{ref}$ can be eliminated by using the output of one of the antenna branches as
reference and dividing the outputs of all branches by this reference value. Without loss of
generality, we will take the signal in the first antenna branch, $r_{2,1}^k$, as reference. This division
operation yields:

$$r_3^k = \frac{1}{r_{2,1}^k} \cdot r_2^k = \frac{1}{d_{RX,BS,1} \cdot n_{ref}^k + n_1^k} \left( d_{RX,BS} \cdot n_{ref}^k + n^k \right) \qquad \text{(formula 39)}$$

If the reference noise $n_{ref}$ is much larger than the receiver noise n, this reduces after averaging
K measurements to:

$$c = \frac{1}{K} \sum_{k=1}^{K} r_3^k = \frac{1}{d_{RX,BS,1}}\, d_{RX,BS} \qquad \text{(formula 40)}$$

which is a vector containing the frequency responses of the receiver branches with a complex
error coefficient, common to all antenna branches.

In the division process, the $r_{2,1}^k$ term is normally dominated by the reference noise multiplied
by the frequency response $d_{RX,BS,1}$ of the receiver chain of the first antenna branch. The
magnitude of this frequency response is by design non-zero since the filters have low ripple
and are calibrated in their passband. However, the amplitude of this term can be shown to be
Rayleigh distributed and, hence, can in some cases be very low. It is therefore advantageous
to substantially eliminate these values before the averaging process since the non-correlated
noise of the LNAs 1560 1564 is dominant in these cases. A suitable criterion is to remove from the
averaging process those realizations where the absolute value of $r_{2,1}^k$ is smaller than 0.15 to
0.25 times its mean value. Finally, the desired value is given by:

$$a \;./\; (c)^2 = d_{RX,BS,1}^2 \cdot \mathrm{diag}(D_{TX,BS} \cdot D_{RX,BS}^{-1}) \qquad \text{(formula 41)}$$

where ./ stands for element-wise division. These are the values that, in some embodiments,
are pre-compensated digitally before transmission. The unknown $d_{RX,BS,1}^2$ factor is
substantially identical in all branches and hence does not introduce MUI.
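
The two calibration steps and the final element-wise division can be simulated for one sub-carrier as follows. This is a sketch under stated assumptions, not the claimed implementation: the function name `calibrate_bs` and all parameter names and default values are hypothetical, the known signal is taken as s = 1 on every branch, noise is modelled as complex Gaussian, and the rejection threshold of 0.2 times the mean is picked from within the 0.15 to 0.25 range mentioned above.

```python
import random


def calibrate_bs(d_tx, d_rx, num_avg=32, enr_db=20.0, lna_noise_std=0.01,
                 reject_fraction=0.2, seed=0):
    """Estimate diag(D_TX,BS * D_RX,BS^-1) up to one complex factor common to all branches."""
    rng = random.Random(seed)
    branches = range(len(d_tx))

    def cnoise(std):
        # Complex Gaussian sample with the given per-component standard deviation.
        return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))

    # Step 1 (TX-RX loopback, noise source off): average D_TX * D_RX * s + n^k with s = 1.
    a = [sum(d_tx[i] * d_rx[i] + cnoise(lna_noise_std) for _ in range(num_avg)) / num_avg
         for i in branches]

    # Step 2 (RX only, noise source on): r_2^k = d_RX * n_ref^k + n^k.
    ref_std = lna_noise_std * 10.0 ** (enr_db / 20.0)
    shots = []
    for _ in range(num_avg):
        n_ref = cnoise(ref_std)  # same reference noise at the input of every branch
        shots.append([d_rx[i] * n_ref + cnoise(lna_noise_std) for i in branches])

    # Drop realizations whose first-branch magnitude is far below its mean value.
    mean_mag = sum(abs(shot[0]) for shot in shots) / num_avg
    kept = [shot for shot in shots if abs(shot[0]) >= reject_fraction * mean_mag]

    # c averages r_2^k divided by its first-branch entry; the result is a ./ c^2.
    c = [sum(shot[i] / shot[0] for shot in kept) / len(kept) for i in branches]
    return [a[i] / c[i] ** 2 for i in branches]
```

Since the estimate is only defined up to a common factor, a meaningful check compares ratios between branches with the corresponding ratios of the true transmit-to-receive responses.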

[0117] A variance analysis of the estimation error of the $D_{TX,BS} \cdot (D_{RX,BS})^{-1}$
product shows that with very mild parameter settings (20 dB ENR, 64 point FFT and 32
averages), one reaches an amplitude variance of 0.0008 and a phase variance of 0.0009. At 1
sigma, this corresponds to 0.24 dB amplitude and 1.72° phase differences. This can easily be
improved with a higher ENR and/or more averaging. These values were obtained for
errors before calibration as large as ±3 dB for the amplitude and ±π for the phase.
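
The conversion from these variances to the quoted 1-sigma figures can be reproduced with a few lines. The sketch assumes that the amplitude variance refers to the relative (linear) amplitude error and that the phase variance is expressed in squared radians; the function names are hypothetical.

```python
import math


def one_sigma_amplitude_db(amplitude_variance):
    """Amplitude difference in dB for a relative amplitude error of one standard deviation."""
    return 20.0 * math.log10(1.0 + math.sqrt(amplitude_variance))


def one_sigma_phase_deg(phase_variance):
    """Phase difference in degrees for a phase error of one standard deviation (radians)."""
    return math.degrees(math.sqrt(phase_variance))
```

Under these assumptions, a variance of 0.0008 gives about 0.24 dB and a variance of 0.0009 gives about 1.72°.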

[0118] Figure 16 shows the BER degradations with and without calibration. '5
degr. & 0.7 dB reciprocity mismatch' indicates that both the phase and amplitude mismatches
described above were introduced (a difficult matching requirement for complete TX and RX
chain). Note that the 4-user case at a BER of $10^{-3}$ is not targeted because the required SNR is
higher than 25 dB, even with ideal calibration.

[0119] Although this calibration method relieves the TX and RX chain from any
matching requirement, it does introduce some matching requirement on the calibration
hardware. For example:

• The splitter (1590 in Figure 15) outputs are matched between branches

• The directional couplers are matched

• The TX/RX/calibration switches (1550 1554 in Figure 15) are matched between
branches

Mismatches in the calibration hardware can be included in the model by including 4
additional diagonal matrices (T, R, A and C are indicated in Figure 15):

• $D_{TA}$ (TX-to-Antenna switch transfer function)

• $D_{TR}$ (TX-to-RX switch transfer function)

• $D_{AR}$ (Antenna-to-RX switch transfer function)

• $D_{CR}$ (Calibr. Noise-to-RX transfer function)

Then, the downlink model becomes:

$$y_{DL} = D_{RX,MT} \cdot H^T \cdot D_{TA} \cdot D_{CR} \cdot D_{TR}^{-1} \cdot D_{AR}^{-1} \cdot H^{-T} \cdot D_{TX,MT}^{-1} \cdot D_P \cdot x_{DL} + n \qquad \text{(formula 42)}$$

The mismatches introduced by the calibration hardware are advantageously minimized.
However, this matching requirement is limited to a few components and is easier to achieve
than the transmitter and receiver matching required when no calibration is included
(matching of overall transfer functions including filters, mixers, LO phases, amplifiers,
etc.).

[0120] An alternative way to solve the non-reciprocity problem is illustrated in
Figure 17. The meaning of the references is as follows: T1 = complete TX transfer function
(TF) until input of directional coupler #1 (DC1); R1 = complete RX TF from DC1 input until
the end of RX1 (so T/R switch 1740 1744 is included in T1 and R1); D1 = TF of DC1 in the
direct path; C1 = TF from DC1 input, through coupled port until combined port of top
splitter/combiner; T2, R2, D2 and C2 are similarly defined; TR = TF of reference TX until
combined port of bottom splitter; and RR = TF from combined port of bottom splitter until the
end of reference RX.

[0121] The unknowns TX1 1710, RX1 1720, TX2 1714, RX2 1724 are to be
determined. Measurements are taken from TX1 1710 to RX-R 1730 and from TX2 1714 to
RX-R 1730, yielding MT1 = T1 × C1 × RR and MT2 = T2 × C2 × RR. Next, measurements
are taken from TX-R 1734 to RX1 1720 and from TX-R 1734 to RX2 1724, yielding MR1 =
TR × C1 × R1 and MR2 = TR × C2 × R2. In a following step, the ratio of TX over RX
measurements is computed for each branch:

• MT1/MR1 = (T1 × C1 × RR)/(TR × C1 × R1) = (T1/R1) × (RR/TR)

• MT2/MR2 = (T2 × C2 × RR)/(TR × C2 × R2) = (T2/R2) × (RR/TR)

These ratios are the desired measurements (T1/R1) and (T2/R2) with a common
multiplicative error (RR/TR), which typically does not affect the reciprocity.
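
The cancellation of the coupler paths in these ratios can be verified with arbitrary complex transfer functions. The following sketch uses a hypothetical function name and argument layout; the numeric values in any use of it are illustrative only.

```python
def reciprocity_ratios(t, r, c, tr_ref, rr_ref):
    """Per-branch ratio MT_i / MR_i, with MT_i = T_i * C_i * RR and MR_i = TR * C_i * R_i."""
    ratios = []
    for t_i, r_i, c_i in zip(t, r, c):
        mt = t_i * c_i * rr_ref  # branch TX measured through the reference RX
        mr = tr_ref * c_i * r_i  # reference TX measured through the branch RX
        ratios.append(mt / mr)
    return ratios
```

Each returned ratio equals the branch transmit-to-receive ratio multiplied by RR/TR, whatever the coupler transfer functions are, so the coupled paths do not need to be matched between branches.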

[0122] The measurement of the first branch is performed substantially 
simultaneously to the measurement of the second branch. The LO and sampling clock in the 
reference TX and RX are locked to the ones in the antenna branches. If needed, to measure 
the TX1 1710 and TX2 1714, the FDMA scheme with the sub-carriers can be used (e.g., odd 
sub-carriers from TX1, even sub-carriers from TX2). This approach offers several 
advantages: nothing needs to be calibrated or matched; compatibility with DBD3 (shared HW 
in Tx and Rx); can be used for more than 2 branches; no 4-position switch. 